{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2uELi9Ykzf04"
      },
      "source": [
        "# Generate molecules. Export molecules into csv files\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CoYpUwvw6rCM"
      },
      "source": [
        "**Codes in this notebook are created mainly by Haotian Cui, and with minor adaptation by Gen Li**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3k3JaqxJYFEV"
      },
      "source": [
        "Ps:Each block of code with no additional comments for modification is executed by clicking the Run button to the left of the block. THese cells must be executed in order\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Hqu9yJKR6GHC"
      },
      "source": [
        "- Installing the required package\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Xndmx2dW53gf",
        "outputId": "7abb4d00-2d2c-4b2d-ec77-38aae20fb4ef"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Requirement already satisfied: rdkit in /opt/miniconda3/envs/agile/lib/python3.10/site-packages (2025.3.2)\n",
            "Requirement already satisfied: numpy in /opt/miniconda3/envs/agile/lib/python3.10/site-packages (from rdkit) (2.2.5)\n",
            "Requirement already satisfied: Pillow in /opt/miniconda3/envs/agile/lib/python3.10/site-packages (from rdkit) (11.2.1)\n",
            "Note: you may need to restart the kernel to use updated packages.\n"
          ]
        }
      ],
      "source": [
        "pip install rdkit\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "x7no1TsqxQH2"
      },
      "source": [
        "- Connect to the gogle drive storage, only by executing this, you can read the file from the google drive"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "CiCKm0_uxPnU",
        "outputId": "de4576db-b4af-4f93-8da1-105bd493e9b5"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Mounted at /content/drive\n"
          ]
        }
      ],
      "source": [
        "# from google.colab import drive\n",
        "# drive.mount('/content/drive')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yAMAGM9_ZHDO"
      },
      "source": [
        "- Import required modules and set up the options"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "id": "bpIYV1JXzf07"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "import warnings\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "from pathlib import Path\n",
        "from typing import Literal\n",
        "\n",
        "from rdkit import Chem\n",
        "from rdkit.Chem import AllChem, Descriptors, Draw, rdMolEnumerator\n",
        "from rdkit.Chem.Draw import IPythonConsole, rdMolDraw2D\n",
        "from rdkit.Chem.Draw import MolDrawing\n",
        "from rdkit.Chem import PandasTools\n",
        "from rdkit.Chem.Scaffolds import MurckoScaffold\n",
        "from pandas.errors import SettingWithCopyWarning\n",
        "from rdkit.Chem.EnumerateStereoisomers import EnumerateStereoisomers\n",
        "\n",
        "IPythonConsole.ipython_useSVG = True\n",
        "# support display images in pandas dataframe\n",
        "PandasTools.RenderImagesInAllDataFrames(images=True)\n",
        "\n",
        "warnings.simplefilter(action='ignore', category=SettingWithCopyWarning)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uo7szizfzf08"
      },
      "source": [
        "### 1. read 37 raw smiles from csv"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "id": "L8BooTQXzf09"
      },
      "outputs": [],
      "source": [
        "def show_atom_number(\n",
        "    mol, label: Literal[\"atomLabel\", \"molAtomMapNumber\", \"atomNote\"] = \"atomNote\"\n",
        "):\n",
        "    for atom in mol.GetAtoms():\n",
        "        atom.SetProp(label, str(atom.GetIdx()))\n",
        "    return mol"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "C5h_FFDHZ6JG"
      },
      "source": [
        "- **Need to change the SMILES string for the core as you need**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "L59ZnTHlBS4N"
      },
      "outputs": [],
      "source": [
        "# define core smile string\n",
        "core_smiles = \"NCC(N)=O\"  # In the example, all four components are connected at N0, N1, C2, N3\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DUEq7BJzaYyn"
      },
      "source": [
        "- **Please change the position for attachments**\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "id": "NwOQOwfYBkTW"
      },
      "outputs": [],
      "source": [
        "# define the postion in the core for the attachment of components\n",
        "core_A_pos = 0  # the position in core to attach component A, R1-NH2\n",
        "core_B_pos = 1  # the position in core to attach component B, R2-CHO\n",
        "core_C_pos = 3  # the position in core to attach component C, R3-NC"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "M5v_e-FGypGe"
      },
      "source": [
        "- Display the molecule image of the core"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 172
        },
        "id": "NIPya_xlzf09",
        "outputId": "2e674e7a-8447-4329-e20a-e9afa2a8bcbe"
      },
      "outputs": [
        {
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAIAAADCEh9HAAAABmJLR0QA/wD/AP+gvaeTAAAT7ElEQVR4nO3deVRU58EG8GcGcAQRF7CAigu4AIpRKSoqGte6YNJoIGkiOa1RjCYF234IieagaaQQs4Da0+KSRNM2Rmqr4tagUdQkCoTFBcEFEVQ0oFEQZFjm/f4AQc0gyx24M5fnd/gD7r3zzsM5+HiX995RCSFAREQtpZY7ABGRaWONEhFJwholIpKENUpEJAlrlIhIEtYoEZEkrFFqR5KTk6Oiok6cOCF3EFIUc7kDELWR0tLSl156ydLSsqysbPz48XLHIeVgjVJ7sWLFikmTJqnVPAIjA+OfFLULSUlJX375ZWRkpNxBSIFYo6R8VVVVixcvXrt2bY8ePeTOQgrEGiXli4iI6NKlS0BAgNxBSJl4bpSU78MPP6yurra1tQVQVlamUqnMzc3fffdduXORQqj4hCdqVxYtWtSzZ8/Vq1fLHYSUgwf11L5YWVl17NhR7hSkKNwbJSKShHujRESSsEaJiCRhjRIRScIaJSKShDVKRCQJa5SISBLexURKd/48PvsMR4/ixg2UlcHODu7umD0br70GjUbucKQEnDdKyqXTISwMH3+M6uraJRoNtNra7/v2RVwcvLzkSkeKwYN6Uq7f/x5r16K6GvPn49QpVFaivBy3bmHLFjg54epVTJ6MjAy5U5LJ494oKdSBA5g1CwA++gh//OOTa2/dwsSJyM6GhwfS08FnOZMErFFSqIkTcewYJk3CN9/o3+C77zBuHADEx8PXty2jkcLwP2FSosJCHD8OAIsXN7jN2LEYORIAdu5so1SkUKxRUqKkJNQcZvn4PG2zCRMA4NSptohEysUaJSW6cQMANBr07Pm0zZydAeD69baIRMrFGiUlKi4GgE6dGtnM2hoASkqg07V6JFIu1igpkaUlgPopog158AAArKx4pZ6k4F8PKZGdHQCUldXuljakoKB+Y6KWYo2SEo0YAQBC4IcfnrZZzVpPz7aIRMrFGiUlGjAA/fsDwFdfNbjNjz/i8GEAmD69jVKRQrFGSYlUKixZAgCffYbTp/VvExKCigp064ZXX23LaKQ8rFFSqKAgeHigogLTp9fudda5fx9Ll2LbNgD45JPa6/VELcWbQUm5cnMxYwayswFg2DB4esLGBlev4sgR3LsHAKtWITxc3oykAKxRUrTiYkRFYeNGFBXVL1Sr4e2N1asxZYp8yUg5WKPUDuh0OH0a167hwQPY2WHIEPziF3JnIuVgjRIRScJLTKRcubl48UWEhOhZdfYsXnwRq1e3eSZSIH4WEynXTz9h504MH65n1Y8/YudO3LnT5plIgbg3SkQkCWuUiEgS1igRkSSsUSIiSVijRESSsEaJiCRhjRIRScIaJSKShDVKRCQJa5SISBLWKBGRJKxRIiJJWKNERJKwRomIJGGNEhFJwholIpKENUpEJAlrlIhIEtYoEZEkrFEiIklYo0REkrBGiYgkYY0SEUnCGiUikoQ1SoqVoVJZW1h4m5v/fNVRlcrawmKWvlVEzcU/I2NUXV198uTJe/fujRkzpnv37nLHMVU6IUorK8urqpq1iqi5WKNG5/79+5MnT75//37v3r2Tk5N37do1ceJEuUMRUYNYo0Zn8+bNVVVVaWlpGo0mIiJi+fLlp06dkjsUETWI50aNzu7du1977TWNRgNgwYIFSUlJBQUFcociogaxRo1Ofn5+3759a753cHDo2LFjfn6+vJGI6ClYo0bHzMxMp9PV/SiEMDMzkzEPET0da9To9O7d++rVqzXfFxQUaLXa3r17yxuJiJ6CNWp05s6du23btvLycgCbN28eP368vb293KGIqEG8Um90fve7323fvn3YsGFOTk4ZGRn79u2TOxERPQ1r1OhYWVklJiampKSUlJR4eXnZ2NjInYiInoY1aozUavWoUaPkTk
FETcJzo0REkrBGiYgk4UF9q8jJQWYmOndGQ3fDZ2Xh0iXY2WHMmMZHKypCejoADBiAfv30bJCaijt34OwMZ+eWZyailuHeaKv4z38wZw6WLGlwg08/xZw5ePvtJo126hSmTcO0afjVr6DV6tlg+XJMm4Zt21qYloikYI2akgsXsHat3CGI6HGsUZOh0UCtRkQELl+WOwoRPYI1ajJsbTF/Ph48wNKlckchokewRk3Jn/8MS0t8/TW++kruKET0EGvUlPTpg+XLAWDZMty9K3caIgLACU+tqrISubn6VxUXt3DM5cvx2WfIy8PKldiwoaXJiMhwWKOt6NIl9O9v4DGtrPDJJ5g3D3/7G+bPb9K0UwJw5cqV3/zmNxUVFbNmzXr//ffljkOKwhptRZaWGDZM/6q8PLT4k0HmzsXMmThwAG++iaQk8JnOjVq8ePHmzZtrHoadlpb23XffvfLKK3KHIuVgjbaifv1w8qT+VcuXS5oBun49hg5Faiq2bEFgYMvHaQ+ys7PT09MBdOvWzdvbOykp6ciRI0ePHgVQWVkpczhSBF5iMkkuLggNBYCVK3mtSb/8/Pz33nsPwIMHD5ycnEJCQu7cubNv374LFy6Ehoaam5sDSElJiYqKqqiokDssmTbWqKkKC8OAASgsRESE3FGMTGlp6apVqwYNGrRr165OnTqFh4dfuHDhgw8+qFnbrVu3yMjIM2fOzJo1q7y8PCwsbNiwYfv375c3M5k01qip6tgR69cDwPr1ePSTQw8fxhtvoLBQrlxyEkLExcW5u7uvXr1aq9X6+fllZmauWrWqY8eOT2w5ePDgffv2JSQkuLm5ZWdnz549e9q0aZmZmbLEJlPHGjUuRUU4eRKXLqGqqvGNZ8zA3LkoL8eFC/ULQ0IQG4tBgxAdjXZ16i85OXn8+PH+/v55eXm//OUvT5w4sWPHjj59+tRtcPPmzfPnz2sfebjL1KlTMzIyoqOju3TpcujQoeHDhwcHB9+7d0+O+GTKBLWCtWsFINzcGtwgJEQA4tln65cUF4tXXxVmZgIQgBg4UJw8Wbtq714BiJ499YyTlyesrWtfEh4uhBDZ2cLXt3bJoEEiPt5wv5Wxun79emBgoFqtBtCzZ8/Y2Njq6upHNygpKXn++eednJxGjBjh4OBw/PjxJ0YoKioKCgqq+SBrW1vb6OjoqqqqNvwNyLRxb9RYvP46Dh9GfDzKypCRAZUK/v548KCRVzk54d13H1syaBDi45GQgCFDcOEC5szBtGk4d671gsupoqIiJibG1dV148aN5ubmQUFBWVlZdZVa5+TJk/b29leuXElNTV26dOkf/vCHJ8axtbWNiYlJSkry8fG5ffv2smXLRo0adfz48Tb8VciUyd3jynT4sHjrLbFmTYMb7Nol3npLrFtXvyQtTZw6Vf/j558LoHaH9PRpERgoli/XP1RFhQgOFoGBYs+eJ5dHR4suXQQgLCxEUJD46aeW/0ZGaM+ePf0f3t7g6+ubk5PTlFd98803tra2Tx+238OHY/v6+l65csUwcUm5WKNGav9+AYi9e6WOU1QkgoJqzxV07y6io4UCjlbT0tImPvxcATc3t4MHDzb9tR988MGUKVOevk1paWlkZKS1tTUAKyur0NDQkpISaZFJyVijRio8XKjVIj/fMKOlpooJE2pPmA4fLhITDTNs27t9+3bdSczu3bs39yRmTk6OnZ3dz8+N6nXt2rWAgACVSgWgV69eW7du1el0LQ1OSsYaNUYFBcLOTixYYOBh9+wR/fvXlqmvr2jaQbCxqKioiI6O7tq1KwALC4vAwMDCwsJmjXDx4kUXF5e///3vzXpVYmLiiBEjavZ8fXx80tPzmvVyag9Yo0anpESMGyf69RO3bhl+8LIyERkpOncWgLC0FKGhorjY8O9icAkJCUOGDKnpsqlTp549e7a5I+zevdvR0fFf//pXC969urp669at9vb2PXoM7dJFFxAgbt5swTCkWKxR43
Lzphg3TvTsKS5dasV3uX5dBAQIlap2HlVsrHh8gpARyc7O9vX1rSnQgQMHxrdoAtfmzZtVKpW3t3fgQ7dv327uIHfv3l2z5mKHDgIQXbqItWuFVtuCLKRAKiFEm8wIoMalpODFF9G1K/77X8M/Ye/nkpMRHIzvvwcALy/ExMDbu9XftOnu3r0bGRkZHR2t1Wq7du0aFha2bNkyjUbTgqHS09MvXrz46JLZs2dbWVm1YKiLF7FiBeLiAGDAAEREwM+vBcOQssjd41Rr/XrRoYMYOFDs3i0SEmq/WnWfVAih04mtW4WDgwCESiX8/ESeEZz6qzuIBqBWqwMCAm4a2VH0oUNi6NDas8xTpogzZ+QORLJijRoLC4vaf5aPfq1Y0RZvXVwsQkOFRiMA0bmz+OtfT5eXl7fFG+tz9OjRZ555pub/+IkTJ6alpcmV5OkqK0VsrLCzE4AwNxeBgaKZV7xIOVijVOvqVREQIKyti+3tHZ2cnLZu3drGAfLz8+smGPXu3dskJhjdvi2CgoS5ef203MpKuTNRm2ON0mOOHDn/6DXxM21yvFpaWhoeHl7zHCYrK6vw8PCysrI2eF9DOX9ezJhRewDh6ioOHJA7ELUt1ig9qebUZI8ePepOTd5qjblXQgghdDrdjh07+vbtC0ClUvn5+eXm5rbSe7W2PXuEs3P9tNzLl+UORG2FNUr63blzJzQ0tEOHDnj4qGOtoSf4pKSkjBs3rmbP19PTs4k3FxkzrVZERwsbGwGIDh1EUJC4d0/uTNT6WKP0NFlZWbNmzappuppHHRtk2Bs3btQ9h8nR0fHnj7YzaTduiMBAoVYLQDg66pmWu3+/CA0VoaEiK0v/CO+8I0JD6+dpJCSIlSvFpk0NvuOuXWLlStGiewvIAFij1LiEhAR3d/e6E6bnzp1r8VBarTY6OtrGxgZAhw4dgoKC7il0hy0lRYwbV3uM7+kpTpyoXxUaWrvcx0fovYpWM23j8OHaH995RwBi7NgG32vhQgGIF14w6C9ATcbnjVLjpk6dmp6eLv0p8fHx8W5ubsuWLSsuLvb19T1//nxMTExNpSqPpyeOH8eOHejbFz/8AB8f+Pvj6tXHtjl+HFu3ypSPDIc1Sk1iYWERHBx8+fLloKAgnU63bt06FxeXmJiY6urqprw8Kytr5syZzz33XE5Ojqur64EDB+Lj452dnVs7trxUKvj54exZvPMONBrExeHll+vX2tsDQEgIbt+WKyAZBmuUmqEFT4m/c+dOcHCwh4fHwYMHax5td+bMmRkzZrRZZtlZW2PNGmRmYt48vP9+/fJJkzB5MoqKaj8rm0wXa5SabeTIkceOHat5SnxqauqECRPmzJmTm5v7xGZVVVUbN24cPHjwunXrAAQGBmZnZwcHB9d8Rnx7078//v1vTJny2MLISKjV+PRT8PNKTBprlFpozpw5mZmZNU+J37t3r7u7e1hY2P3792vWbtiwYcSIEYsXLy4qKpoyZUpqampsbKydnZ28mY2Nlxd++1sIgSVL2tfHuCpMe9wvIEOxtLQMDQ2dP3/+22+//Y9//CMqKuqLL74YO3bst99+W1BQAGDAgAERERF+fAhSwyIisHMnzp3DRx8hLOxpWxYVYft2/atyclojGjWZ3FMFSCGOHTs2cuTIur8rlUrl7+9v8Bn7ylAz4enll2t/jImpfYp23Y1Peic8NfrFCU9y4d4oGYaPj09ycvKiRYsSExMtLS2//PLLoUOHyh3KNLz5Jj7/HGlpCA5GfHyDmzk6wt9f/6rDh3H2bCulo8axRslg1Gr1li1b5E5heszMEBuLMWOwdy8OHMDMmfo3698f0dH6Vy1axBqVEy8xEcnPywsLFgDAn/7Ea02mhzVKZBSiomBnh/PnsXGj3FGomVijREahe3dERADAe+9Bp5M7DTUHa5QMSQiRkJBwlifqWuT11+HtjR9/RNPusG1ERQXi4xETg+3bUVBggAGpIaxRMph//vOfQ4YMee
GFF2JjY+XOYpLUasTGwiA3eZ0+jUGD8NJL+PRTvPEGBg/G7t0GGJb0Yo2SwTg6Ou7du3fp0qVyBzFhHh544w0DjBMYiD59kJ+PjAzk5cHVFQsXorzcACPTz3HCExnM5MmT5Y5gGl59FZ6ecHLSv3bNGkyYAAB18279/eHujh49Ghxw4UI8+yz69Klf8vXX0GphawsANjZYvBgLFyI7Gw8/dJUMiTVK1NY8PODh0eBaGxs8cffsM880Un+jR2P06CcHeZSZGQBUVDQvJzURD+qJlO9//0P37hg2TO4cCsW9USKFO3YMcXH45BNoNHJHUSjujRIZterq6mvXrpWUlLTs5WfPws8PM2diyRLD5qJ6rFEymLi4OBcXl02bNm3bts3FxSUlJUXuRCbvxIkTrq6uM2bM6Nu37yuvvFLZzBtFDx3CpEkYPhxxcYaZR0V6qYQQcmcghdBqtWVlZXU/du7cuX0+6N6AAgICgoKCvLy8iouLhw4d+uGHH/o39JSnxwmBdevwf/+HBQuwYQMsLFo7abvGv3IyGI1Go+HpN4P64osvar6xtrbWaDRN/ADB6mrMm4fduzFiBJyd8fHHtcsnTcKoUa2UtF1jjRIZtcLCwk2bNiUmJo4ePXrevHlNeYlWi2vX4OkJAHFx9ct79WKNtgrWKJFRMzMz69Spk4ODQ0ZGRmFhYa9evRp9iZUVeF66LfHcKJFp+PWvf+3m5vaXv/xF7iD0JF6pJzJe33//fVVVVc33arW6gvchGSXujRIZKSHEc889V1hYOH369Nzc3H379n377beurq5y56InsUaJjJdOpzt48OCZM2e6du36/PPPOzg4yJ2I9GCNEhFJwnOjRESSsEaJiCRhjRIRScIaJSKShDVKRCQJa5SISJL/BzbF4DDODC0EAAAAjXpUWHRyZGtpdFBLTCByZGtpdCAyMDI1LjAzLjIAAHice79v7T0GIBAAYiYGCGAFYhYgbmBkZ0gA0oxMbGCaCUhrAGlmFnaGDBDNyMQBEWDiZmAEKmNgYlZgYtFgcgKZIu4G0soAM7ORlfUAkFYDcZw63PczMDjsB7GB4vZAaimIvejkC3tU8QMgOQYxAOnJEbARCjhtAAAAynpUWHRNT0wgcmRraXQgMjAyNS4wMy4yAAB4nH1Qyw6DIBC88xXzA5LlpXJUMU3TiElr+w+99//TxYaiB92FZHYZhh0EUtzD7f3BP3QQAqCT5b3HyxCRmJAA+vFyjRiWrs+dYX7G5QEHyzc498xumafcUYiotKSmtbUCSVpjAzJRY0BFsmm81WZF7oBpmEnSKV8rl4iHkpbfVrJVbe3OFR3moqikOxIcY9iZ+9nt5xiK3ZS6mOICpkyueNsyH5/BbdW3WqnOn85YfAFmTVfNGpJLJQAAAIB6VFh0U01JTEVTIHJka2l0IDIwMjUuMDMuMgAAeJxlyT0KgDAMQOGrOCrEkLRNfyJO7tUrODhKRRw9vHTq4PY+Xl6WPg/z2r39aJBCdB4IppEwhOSMhVoCE6Fw8ix1MkaOXn6LUWCA/SnndpdLCWvm8hxIyg2spsGobbDqGtz7AVEZKbCEUZzQAAAAAElFTkSuQmCC",
            "image/svg+xml": [
              "<?xml version='1.0' encoding='iso-8859-1'?>\n",
              "<svg version='1.1' baseProfile='full'\n",
              "              xmlns='http://www.w3.org/2000/svg'\n",
              "                      xmlns:rdkit='http://www.rdkit.org/xml'\n",
              "                      xmlns:xlink='http://www.w3.org/1999/xlink'\n",
              "                  xml:space='preserve'\n",
              "width='450px' height='150px' viewBox='0 0 450 150'>\n",
              "<!-- END OF HEADER -->\n",
              "<rect style='opacity:1.0;fill:#FFFFFF;stroke:none' width='450.0' height='150.0' x='0.0' y='0.0'> </rect>\n",
              "<path class='bond-0 atom-0 atom-1' d='M 155.5,96.0 L 177.1,108.4' style='fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-0 atom-0 atom-1' d='M 177.1,108.4 L 198.7,120.9' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-1 atom-1 atom-2' d='M 198.7,120.9 L 250.3,91.1' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-2 atom-2 atom-3' d='M 250.3,91.1 L 271.9,103.6' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-2 atom-2 atom-3' d='M 271.9,103.6 L 293.5,116.1' style='fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-3 atom-2 atom-4' d='M 254.7,93.7 L 254.7,68.0' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-3 atom-2 atom-4' d='M 254.7,68.0 L 254.7,42.3' style='fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-3 atom-2 atom-4' d='M 245.8,93.7 L 245.8,68.0' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-3 atom-2 atom-4' d='M 245.8,68.0 L 245.8,42.3' style='fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path d='M 197.6,120.3 L 198.7,120.9 L 201.3,119.4' style='fill:none;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;' />\n",
              "<path d='M 247.7,92.6 L 250.3,91.1 L 251.4,91.8' style='fill:none;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;' />\n",
              "<path class='atom-0' d='M 114.5 82.7\n",
              "L 116.8 82.7\n",
              "L 116.8 89.9\n",
              "L 125.5 89.9\n",
              "L 125.5 82.7\n",
              "L 127.7 82.7\n",
              "L 127.7 99.6\n",
              "L 125.5 99.6\n",
              "L 125.5 91.8\n",
              "L 116.8 91.8\n",
              "L 116.8 99.6\n",
              "L 114.5 99.6\n",
              "L 114.5 82.7\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-0' d='M 131.0 99.0\n",
              "Q 131.5 97.9, 132.4 97.3\n",
              "Q 133.4 96.7, 134.8 96.7\n",
              "Q 136.4 96.7, 137.4 97.7\n",
              "Q 138.3 98.6, 138.3 100.2\n",
              "Q 138.3 101.8, 137.1 103.4\n",
              "Q 135.9 104.9, 133.4 106.7\n",
              "L 138.5 106.7\n",
              "L 138.5 108.0\n",
              "L 131.0 108.0\n",
              "L 131.0 107.0\n",
              "Q 133.1 105.5, 134.3 104.4\n",
              "Q 135.6 103.3, 136.2 102.3\n",
              "Q 136.8 101.3, 136.8 100.3\n",
              "Q 136.8 99.2, 136.2 98.6\n",
              "Q 135.7 98.0, 134.8 98.0\n",
              "Q 133.9 98.0, 133.3 98.4\n",
              "Q 132.7 98.7, 132.2 99.5\n",
              "L 131.0 99.0\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-0' d='M 143.4 82.7\n",
              "L 148.9 91.6\n",
              "Q 149.5 92.5, 150.3 94.1\n",
              "Q 151.2 95.7, 151.3 95.8\n",
              "L 151.3 82.7\n",
              "L 153.5 82.7\n",
              "L 153.5 99.6\n",
              "L 151.2 99.6\n",
              "L 145.3 89.8\n",
              "Q 144.6 88.7, 143.8 87.3\n",
              "Q 143.1 86.0, 142.9 85.6\n",
              "L 142.9 99.6\n",
              "L 140.7 99.6\n",
              "L 140.7 82.7\n",
              "L 143.4 82.7\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-3' d='M 298.1 112.5\n",
              "L 303.7 121.4\n",
              "Q 304.2 122.3, 305.1 123.9\n",
              "Q 306.0 125.5, 306.0 125.6\n",
              "L 306.0 112.5\n",
              "L 308.3 112.5\n",
              "L 308.3 129.4\n",
              "L 305.9 129.4\n",
              "L 300.0 119.6\n",
              "Q 299.3 118.4, 298.6 117.1\n",
              "Q 297.9 115.8, 297.7 115.4\n",
              "L 297.7 129.4\n",
              "L 295.5 129.4\n",
              "L 295.5 112.5\n",
              "L 298.1 112.5\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-3' d='M 311.5 112.5\n",
              "L 313.8 112.5\n",
              "L 313.8 119.7\n",
              "L 322.4 119.7\n",
              "L 322.4 112.5\n",
              "L 324.7 112.5\n",
              "L 324.7 129.4\n",
              "L 322.4 129.4\n",
              "L 322.4 121.6\n",
              "L 313.8 121.6\n",
              "L 313.8 129.4\n",
              "L 311.5 129.4\n",
              "L 311.5 112.5\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-3' d='M 328.0 128.8\n",
              "Q 328.4 127.7, 329.4 127.1\n",
              "Q 330.3 126.5, 331.7 126.5\n",
              "Q 333.4 126.5, 334.3 127.4\n",
              "Q 335.3 128.4, 335.3 130.0\n",
              "Q 335.3 131.6, 334.0 133.2\n",
              "Q 332.8 134.7, 330.3 136.5\n",
              "L 335.5 136.5\n",
              "L 335.5 137.8\n",
              "L 328.0 137.8\n",
              "L 328.0 136.7\n",
              "Q 330.0 135.3, 331.3 134.2\n",
              "Q 332.5 133.1, 333.1 132.1\n",
              "Q 333.7 131.1, 333.7 130.1\n",
              "Q 333.7 129.0, 333.2 128.4\n",
              "Q 332.6 127.8, 331.7 127.8\n",
              "Q 330.8 127.8, 330.2 128.2\n",
              "Q 329.6 128.5, 329.2 129.3\n",
              "L 328.0 128.8\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-4' d='M 242.5 31.6\n",
              "Q 242.5 27.6, 244.5 25.3\n",
              "Q 246.5 23.0, 250.3 23.0\n",
              "Q 254.0 23.0, 256.0 25.3\n",
              "Q 258.0 27.6, 258.0 31.6\n",
              "Q 258.0 35.7, 256.0 38.1\n",
              "Q 254.0 40.4, 250.3 40.4\n",
              "Q 246.6 40.4, 244.5 38.1\n",
              "Q 242.5 35.7, 242.5 31.6\n",
              "M 250.3 38.5\n",
              "Q 252.8 38.5, 254.2 36.7\n",
              "Q 255.6 35.0, 255.6 31.6\n",
              "Q 255.6 28.3, 254.2 26.6\n",
              "Q 252.8 24.9, 250.3 24.9\n",
              "Q 247.7 24.9, 246.3 26.6\n",
              "Q 244.9 28.3, 244.9 31.6\n",
              "Q 244.9 35.0, 246.3 36.7\n",
              "Q 247.7 38.5, 250.3 38.5\n",
              "' fill='#FF0000'/>\n",
              "<path class='note' d='M 137.2 78.3\n",
              "Q 135.6 78.3, 134.9 77.2\n",
              "Q 134.1 76.0, 134.1 74.0\n",
              "Q 134.1 71.9, 134.9 70.8\n",
              "Q 135.6 69.7, 137.2 69.7\n",
              "Q 138.7 69.7, 139.5 70.8\n",
              "Q 140.3 71.9, 140.3 74.0\n",
              "Q 140.3 76.0, 139.5 77.2\n",
              "Q 138.7 78.3, 137.2 78.3\n",
              "M 137.2 77.4\n",
              "Q 138.1 77.4, 138.6 76.5\n",
              "Q 139.1 75.6, 139.1 74.0\n",
              "Q 139.1 72.3, 138.6 71.5\n",
              "Q 138.1 70.6, 137.2 70.6\n",
              "Q 136.3 70.6, 135.8 71.5\n",
              "Q 135.3 72.3, 135.3 74.0\n",
              "Q 135.3 75.6, 135.8 76.5\n",
              "Q 136.3 77.4, 137.2 77.4\n",
              "' fill='#000000'/>\n",
              "<path class='note' d='M 196.6 134.1\n",
              "L 198.5 134.1\n",
              "L 198.5 127.8\n",
              "L 196.4 128.5\n",
              "L 196.1 127.8\n",
              "L 198.7 126.6\n",
              "L 199.6 126.8\n",
              "L 199.6 134.1\n",
              "L 201.2 134.1\n",
              "L 201.2 135.1\n",
              "L 196.6 135.1\n",
              "L 196.6 134.1\n",
              "' fill='#000000'/>\n",
              "<path class='note' d='M 247.5 98.5\n",
              "Q 247.8 97.7, 248.5 97.3\n",
              "Q 249.2 96.8, 250.3 96.8\n",
              "Q 251.5 96.8, 252.3 97.5\n",
              "Q 253.0 98.2, 253.0 99.4\n",
              "Q 253.0 100.7, 252.0 101.8\n",
              "Q 251.1 103.0, 249.2 104.4\n",
              "L 253.1 104.4\n",
              "L 253.1 105.3\n",
              "L 247.4 105.3\n",
              "L 247.4 104.5\n",
              "Q 249.0 103.4, 249.9 102.6\n",
              "Q 250.9 101.7, 251.3 101.0\n",
              "Q 251.8 100.2, 251.8 99.5\n",
              "Q 251.8 98.7, 251.4 98.2\n",
              "Q 251.0 97.8, 250.3 97.8\n",
              "Q 249.6 97.8, 249.1 98.0\n",
              "Q 248.7 98.3, 248.4 98.9\n",
              "L 247.5 98.5\n",
              "' fill='#000000'/>\n",
              "<path class='note' d='M 313.0 137.9\n",
              "Q 313.8 138.2, 314.2 138.7\n",
              "Q 314.6 139.2, 314.6 140.1\n",
              "Q 314.6 140.8, 314.3 141.3\n",
              "Q 313.9 141.9, 313.3 142.2\n",
              "Q 312.6 142.5, 311.7 142.5\n",
              "Q 310.8 142.5, 310.2 142.2\n",
              "Q 309.5 141.9, 309.0 141.2\n",
              "L 309.6 140.5\n",
              "Q 310.2 141.1, 310.6 141.3\n",
              "Q 311.0 141.5, 311.7 141.5\n",
              "Q 312.5 141.5, 313.0 141.1\n",
              "Q 313.4 140.7, 313.4 140.1\n",
              "Q 313.4 139.2, 312.9 138.8\n",
              "Q 312.5 138.4, 311.4 138.4\n",
              "L 310.8 138.4\n",
              "L 310.8 137.6\n",
              "L 311.4 137.6\n",
              "Q 312.3 137.6, 312.8 137.2\n",
              "Q 313.3 136.8, 313.3 136.0\n",
              "Q 313.3 135.5, 312.8 135.1\n",
              "Q 312.4 134.8, 311.8 134.8\n",
              "Q 311.0 134.8, 310.6 135.1\n",
              "Q 310.2 135.3, 309.8 135.9\n",
              "L 309.0 135.5\n",
              "Q 309.3 134.8, 310.0 134.3\n",
              "Q 310.8 133.9, 311.8 133.9\n",
              "Q 313.0 133.9, 313.7 134.4\n",
              "Q 314.4 135.0, 314.4 136.0\n",
              "Q 314.4 136.7, 314.1 137.2\n",
              "Q 313.7 137.7, 313.0 137.9\n",
              "' fill='#000000'/>\n",
              "<path class='note' d='M 252.5 13.0\n",
              "L 253.5 13.0\n",
              "L 253.5 14.0\n",
              "L 252.5 14.0\n",
              "L 252.5 15.9\n",
              "L 251.4 15.9\n",
              "L 251.4 14.0\n",
              "L 247.0 14.0\n",
              "L 247.0 13.2\n",
              "L 250.7 7.5\n",
              "L 252.5 7.5\n",
              "L 252.5 13.0\n",
              "M 248.4 13.0\n",
              "L 251.4 13.0\n",
              "L 251.4 8.3\n",
              "L 248.4 13.0\n",
              "' fill='#000000'/>\n",
              "</svg>\n"
            ],
            "text/html": [
              "<?xml version='1.0' encoding='iso-8859-1'?>\n",
              "<svg version='1.1' baseProfile='full'\n",
              "              xmlns='http://www.w3.org/2000/svg'\n",
              "                      xmlns:rdkit='http://www.rdkit.org/xml'\n",
              "                      xmlns:xlink='http://www.w3.org/1999/xlink'\n",
              "                  xml:space='preserve'\n",
              "width='450px' height='150px' viewBox='0 0 450 150'>\n",
              "<!-- END OF HEADER -->\n",
              "<rect style='opacity:1.0;fill:#FFFFFF;stroke:none' width='450.0' height='150.0' x='0.0' y='0.0'> </rect>\n",
              "<path class='bond-0 atom-0 atom-1' d='M 155.5,96.0 L 177.1,108.4' style='fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-0 atom-0 atom-1' d='M 177.1,108.4 L 198.7,120.9' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-1 atom-1 atom-2' d='M 198.7,120.9 L 250.3,91.1' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-2 atom-2 atom-3' d='M 250.3,91.1 L 271.9,103.6' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-2 atom-2 atom-3' d='M 271.9,103.6 L 293.5,116.1' style='fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-3 atom-2 atom-4' d='M 254.7,93.7 L 254.7,68.0' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-3 atom-2 atom-4' d='M 254.7,68.0 L 254.7,42.3' style='fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-3 atom-2 atom-4' d='M 245.8,93.7 L 245.8,68.0' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path class='bond-3 atom-2 atom-4' d='M 245.8,68.0 L 245.8,42.3' style='fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />\n",
              "<path d='M 197.6,120.3 L 198.7,120.9 L 201.3,119.4' style='fill:none;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;' />\n",
              "<path d='M 247.7,92.6 L 250.3,91.1 L 251.4,91.8' style='fill:none;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;' />\n",
              "<path class='atom-0' d='M 114.5 82.7\n",
              "L 116.8 82.7\n",
              "L 116.8 89.9\n",
              "L 125.5 89.9\n",
              "L 125.5 82.7\n",
              "L 127.7 82.7\n",
              "L 127.7 99.6\n",
              "L 125.5 99.6\n",
              "L 125.5 91.8\n",
              "L 116.8 91.8\n",
              "L 116.8 99.6\n",
              "L 114.5 99.6\n",
              "L 114.5 82.7\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-0' d='M 131.0 99.0\n",
              "Q 131.5 97.9, 132.4 97.3\n",
              "Q 133.4 96.7, 134.8 96.7\n",
              "Q 136.4 96.7, 137.4 97.7\n",
              "Q 138.3 98.6, 138.3 100.2\n",
              "Q 138.3 101.8, 137.1 103.4\n",
              "Q 135.9 104.9, 133.4 106.7\n",
              "L 138.5 106.7\n",
              "L 138.5 108.0\n",
              "L 131.0 108.0\n",
              "L 131.0 107.0\n",
              "Q 133.1 105.5, 134.3 104.4\n",
              "Q 135.6 103.3, 136.2 102.3\n",
              "Q 136.8 101.3, 136.8 100.3\n",
              "Q 136.8 99.2, 136.2 98.6\n",
              "Q 135.7 98.0, 134.8 98.0\n",
              "Q 133.9 98.0, 133.3 98.4\n",
              "Q 132.7 98.7, 132.2 99.5\n",
              "L 131.0 99.0\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-0' d='M 143.4 82.7\n",
              "L 148.9 91.6\n",
              "Q 149.5 92.5, 150.3 94.1\n",
              "Q 151.2 95.7, 151.3 95.8\n",
              "L 151.3 82.7\n",
              "L 153.5 82.7\n",
              "L 153.5 99.6\n",
              "L 151.2 99.6\n",
              "L 145.3 89.8\n",
              "Q 144.6 88.7, 143.8 87.3\n",
              "Q 143.1 86.0, 142.9 85.6\n",
              "L 142.9 99.6\n",
              "L 140.7 99.6\n",
              "L 140.7 82.7\n",
              "L 143.4 82.7\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-3' d='M 298.1 112.5\n",
              "L 303.7 121.4\n",
              "Q 304.2 122.3, 305.1 123.9\n",
              "Q 306.0 125.5, 306.0 125.6\n",
              "L 306.0 112.5\n",
              "L 308.3 112.5\n",
              "L 308.3 129.4\n",
              "L 305.9 129.4\n",
              "L 300.0 119.6\n",
              "Q 299.3 118.4, 298.6 117.1\n",
              "Q 297.9 115.8, 297.7 115.4\n",
              "L 297.7 129.4\n",
              "L 295.5 129.4\n",
              "L 295.5 112.5\n",
              "L 298.1 112.5\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-3' d='M 311.5 112.5\n",
              "L 313.8 112.5\n",
              "L 313.8 119.7\n",
              "L 322.4 119.7\n",
              "L 322.4 112.5\n",
              "L 324.7 112.5\n",
              "L 324.7 129.4\n",
              "L 322.4 129.4\n",
              "L 322.4 121.6\n",
              "L 313.8 121.6\n",
              "L 313.8 129.4\n",
              "L 311.5 129.4\n",
              "L 311.5 112.5\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-3' d='M 328.0 128.8\n",
              "Q 328.4 127.7, 329.4 127.1\n",
              "Q 330.3 126.5, 331.7 126.5\n",
              "Q 333.4 126.5, 334.3 127.4\n",
              "Q 335.3 128.4, 335.3 130.0\n",
              "Q 335.3 131.6, 334.0 133.2\n",
              "Q 332.8 134.7, 330.3 136.5\n",
              "L 335.5 136.5\n",
              "L 335.5 137.8\n",
              "L 328.0 137.8\n",
              "L 328.0 136.7\n",
              "Q 330.0 135.3, 331.3 134.2\n",
              "Q 332.5 133.1, 333.1 132.1\n",
              "Q 333.7 131.1, 333.7 130.1\n",
              "Q 333.7 129.0, 333.2 128.4\n",
              "Q 332.6 127.8, 331.7 127.8\n",
              "Q 330.8 127.8, 330.2 128.2\n",
              "Q 329.6 128.5, 329.2 129.3\n",
              "L 328.0 128.8\n",
              "' fill='#0000FF'/>\n",
              "<path class='atom-4' d='M 242.5 31.6\n",
              "Q 242.5 27.6, 244.5 25.3\n",
              "Q 246.5 23.0, 250.3 23.0\n",
              "Q 254.0 23.0, 256.0 25.3\n",
              "Q 258.0 27.6, 258.0 31.6\n",
              "Q 258.0 35.7, 256.0 38.1\n",
              "Q 254.0 40.4, 250.3 40.4\n",
              "Q 246.6 40.4, 244.5 38.1\n",
              "Q 242.5 35.7, 242.5 31.6\n",
              "M 250.3 38.5\n",
              "Q 252.8 38.5, 254.2 36.7\n",
              "Q 255.6 35.0, 255.6 31.6\n",
              "Q 255.6 28.3, 254.2 26.6\n",
              "Q 252.8 24.9, 250.3 24.9\n",
              "Q 247.7 24.9, 246.3 26.6\n",
              "Q 244.9 28.3, 244.9 31.6\n",
              "Q 244.9 35.0, 246.3 36.7\n",
              "Q 247.7 38.5, 250.3 38.5\n",
              "' fill='#FF0000'/>\n",
              "<path class='note' d='M 137.2 78.3\n",
              "Q 135.6 78.3, 134.9 77.2\n",
              "Q 134.1 76.0, 134.1 74.0\n",
              "Q 134.1 71.9, 134.9 70.8\n",
              "Q 135.6 69.7, 137.2 69.7\n",
              "Q 138.7 69.7, 139.5 70.8\n",
              "Q 140.3 71.9, 140.3 74.0\n",
              "Q 140.3 76.0, 139.5 77.2\n",
              "Q 138.7 78.3, 137.2 78.3\n",
              "M 137.2 77.4\n",
              "Q 138.1 77.4, 138.6 76.5\n",
              "Q 139.1 75.6, 139.1 74.0\n",
              "Q 139.1 72.3, 138.6 71.5\n",
              "Q 138.1 70.6, 137.2 70.6\n",
              "Q 136.3 70.6, 135.8 71.5\n",
              "Q 135.3 72.3, 135.3 74.0\n",
              "Q 135.3 75.6, 135.8 76.5\n",
              "Q 136.3 77.4, 137.2 77.4\n",
              "' fill='#000000'/>\n",
              "<path class='note' d='M 196.6 134.1\n",
              "L 198.5 134.1\n",
              "L 198.5 127.8\n",
              "L 196.4 128.5\n",
              "L 196.1 127.8\n",
              "L 198.7 126.6\n",
              "L 199.6 126.8\n",
              "L 199.6 134.1\n",
              "L 201.2 134.1\n",
              "L 201.2 135.1\n",
              "L 196.6 135.1\n",
              "L 196.6 134.1\n",
              "' fill='#000000'/>\n",
              "<path class='note' d='M 247.5 98.5\n",
              "Q 247.8 97.7, 248.5 97.3\n",
              "Q 249.2 96.8, 250.3 96.8\n",
              "Q 251.5 96.8, 252.3 97.5\n",
              "Q 253.0 98.2, 253.0 99.4\n",
              "Q 253.0 100.7, 252.0 101.8\n",
              "Q 251.1 103.0, 249.2 104.4\n",
              "L 253.1 104.4\n",
              "L 253.1 105.3\n",
              "L 247.4 105.3\n",
              "L 247.4 104.5\n",
              "Q 249.0 103.4, 249.9 102.6\n",
              "Q 250.9 101.7, 251.3 101.0\n",
              "Q 251.8 100.2, 251.8 99.5\n",
              "Q 251.8 98.7, 251.4 98.2\n",
              "Q 251.0 97.8, 250.3 97.8\n",
              "Q 249.6 97.8, 249.1 98.0\n",
              "Q 248.7 98.3, 248.4 98.9\n",
              "L 247.5 98.5\n",
              "' fill='#000000'/>\n",
              "<path class='note' d='M 313.0 137.9\n",
              "Q 313.8 138.2, 314.2 138.7\n",
              "Q 314.6 139.2, 314.6 140.1\n",
              "Q 314.6 140.8, 314.3 141.3\n",
              "Q 313.9 141.9, 313.3 142.2\n",
              "Q 312.6 142.5, 311.7 142.5\n",
              "Q 310.8 142.5, 310.2 142.2\n",
              "Q 309.5 141.9, 309.0 141.2\n",
              "L 309.6 140.5\n",
              "Q 310.2 141.1, 310.6 141.3\n",
              "Q 311.0 141.5, 311.7 141.5\n",
              "Q 312.5 141.5, 313.0 141.1\n",
              "Q 313.4 140.7, 313.4 140.1\n",
              "Q 313.4 139.2, 312.9 138.8\n",
              "Q 312.5 138.4, 311.4 138.4\n",
              "L 310.8 138.4\n",
              "L 310.8 137.6\n",
              "L 311.4 137.6\n",
              "Q 312.3 137.6, 312.8 137.2\n",
              "Q 313.3 136.8, 313.3 136.0\n",
              "Q 313.3 135.5, 312.8 135.1\n",
              "Q 312.4 134.8, 311.8 134.8\n",
              "Q 311.0 134.8, 310.6 135.1\n",
              "Q 310.2 135.3, 309.8 135.9\n",
              "L 309.0 135.5\n",
              "Q 309.3 134.8, 310.0 134.3\n",
              "Q 310.8 133.9, 311.8 133.9\n",
              "Q 313.0 133.9, 313.7 134.4\n",
              "Q 314.4 135.0, 314.4 136.0\n",
              "Q 314.4 136.7, 314.1 137.2\n",
              "Q 313.7 137.7, 313.0 137.9\n",
              "' fill='#000000'/>\n",
              "<path class='note' d='M 252.5 13.0\n",
              "L 253.5 13.0\n",
              "L 253.5 14.0\n",
              "L 252.5 14.0\n",
              "L 252.5 15.9\n",
              "L 251.4 15.9\n",
              "L 251.4 14.0\n",
              "L 247.0 14.0\n",
              "L 247.0 13.2\n",
              "L 250.7 7.5\n",
              "L 252.5 7.5\n",
              "L 252.5 13.0\n",
              "M 248.4 13.0\n",
              "L 251.4 13.0\n",
              "L 251.4 8.3\n",
              "L 248.4 13.0\n",
              "' fill='#000000'/>\n",
              "</svg>\n"
            ],
            "text/plain": [
              "<rdkit.Chem.rdchem.Mol at 0x11fc810e0>"
            ]
          },
          "execution_count": 5,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "core_mol = Chem.MolFromSmiles(core_smiles) # Convert smile strings to rdkit molecule\n",
        "core_num_atoms = core_mol.GetNumAtoms() # Get the atom numbers\n",
        "show_atom_number(core_mol)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ABPFxKGwa6IK"
      },
      "source": [
        "-  **Define the path to the .csv file for all components, which is the csv file uploaded**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {
        "id": "j31nfDulbAMM"
      },
      "outputs": [],
      "source": [
        "data_file = Path(\"/Users/yuexu/Documents/GitHub/Agiledataset/AGILE_1200_SMILES_Compontents.csv\") # Please change this to your path of the .csv file for all components, the format of the file need to follow some requirments"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 424
        },
        "id": "IUdXBBJdbJ2i",
        "outputId": "f815a168-19fd-4823-ddaf-259f8d46d0b4"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>ID</th>\n",
              "      <th>Name</th>\n",
              "      <th>SMILES</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>A1</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CN(C)CCN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>A2</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCCN1CCCC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>A3</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCN(CC)CCCN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>A4</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCN(CCN)CC</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>A5</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCN(CCN)CCN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>A6</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CN(C)CCCN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>A7</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCCN1CCOCC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>A8</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCN(CCCC)CCCN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>A9</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CC1=NN(C)C(N)=C1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9</th>\n",
              "      <td>A10</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCN1CCCC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>10</th>\n",
              "      <td>A11</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NN1CCOCC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>11</th>\n",
              "      <td>A12</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CN1CCC(N)CC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>12</th>\n",
              "      <td>A13</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NN1CCCCC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>13</th>\n",
              "      <td>A14</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NC1=CNN=C1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>14</th>\n",
              "      <td>A15</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCN1CCCCC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>15</th>\n",
              "      <td>A16</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CC(N(C(C)C)CCN)C</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>16</th>\n",
              "      <td>A17</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCCN(CCCN)C</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>17</th>\n",
              "      <td>A18</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCN1C(CN)CCC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>18</th>\n",
              "      <td>A19</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCN1CCNCC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>19</th>\n",
              "      <td>A20</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NC1CCNCC1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20</th>\n",
              "      <td>B1</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(CCCCCCCC)OCCCCCC=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>21</th>\n",
              "      <td>B2</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CC(CCC(OCCCCCC=O)=O)CCCCC</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>22</th>\n",
              "      <td>B3</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(CCCCCCCCC)OCCCCCC=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>23</th>\n",
              "      <td>B4</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(/C=C/CCCCCCC)OCCCCCC=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>24</th>\n",
              "      <td>B5</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(/C=C\\CCCCCCC)OCCCCCC=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>25</th>\n",
              "      <td>B6</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(CCCCCCCCCC)OCCCCCC=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>26</th>\n",
              "      <td>B7</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(CCCCCCCCC#C)OCCCCCC=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>27</th>\n",
              "      <td>B8</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(CCCCCCCCC=C)OCCCCCC=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>28</th>\n",
              "      <td>B9</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCCCCCC(OCCCCCC=O)=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>29</th>\n",
              "      <td>B10</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCCCCCCCC(OCCCCCC=O)=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>30</th>\n",
              "      <td>B11</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCC/C=C\\CCCCCCCC(OCCCCCC=O)=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>31</th>\n",
              "      <td>B12</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCC/C=C\\C/C=C\\CCCCCCCC(OCCCCCC=O)=O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>32</th>\n",
              "      <td>C1</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCC[N+]#[C-]</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>33</th>\n",
              "      <td>C2</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCCCC[N+]#[C-]</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>34</th>\n",
              "      <td>C3</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCCCCCC[N+]#[C-]</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>35</th>\n",
              "      <td>C4</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCCCCCCCC[N+]#[C-]</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>36</th>\n",
              "      <td>C5</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCC/C=C\\CCCCCCCC[N+]#[C-]</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "     ID  Name                                 SMILES\n",
              "0    A1   NaN                               CN(C)CCN\n",
              "1    A2   NaN                            NCCCN1CCCC1\n",
              "2    A3   NaN                            CCN(CC)CCCN\n",
              "3    A4   NaN                             CCN(CCN)CC\n",
              "4    A5   NaN                           NCCN(CCN)CCN\n",
              "5    A6   NaN                              CN(C)CCCN\n",
              "6    A7   NaN                           NCCCN1CCOCC1\n",
              "7    A8   NaN                        CCCCN(CCCC)CCCN\n",
              "8    A9   NaN                       CC1=NN(C)C(N)=C1\n",
              "9   A10   NaN                             NCCN1CCCC1\n",
              "10  A11   NaN                              NN1CCOCC1\n",
              "11  A12   NaN                           CN1CCC(N)CC1\n",
              "12  A13   NaN                              NN1CCCCC1\n",
              "13  A14   NaN                             NC1=CNN=C1\n",
              "14  A15   NaN                            NCCN1CCCCC1\n",
              "15  A16   NaN                       CC(N(C(C)C)CCN)C\n",
              "16  A17   NaN                           NCCCN(CCCN)C\n",
              "17  A18   NaN                          CCN1C(CN)CCC1\n",
              "18  A19   NaN                            NCCN1CCNCC1\n",
              "19  A20   NaN                              NC1CCNCC1\n",
              "20   B1   NaN                 O=C(CCCCCCCC)OCCCCCC=O\n",
              "21   B2   NaN              CC(CCC(OCCCCCC=O)=O)CCCCC\n",
              "22   B3   NaN                O=C(CCCCCCCCC)OCCCCCC=O\n",
              "23   B4   NaN             O=C(/C=C/CCCCCCC)OCCCCCC=O\n",
              "24   B5   NaN             O=C(/C=C\\CCCCCCC)OCCCCCC=O\n",
              "25   B6   NaN               O=C(CCCCCCCCCC)OCCCCCC=O\n",
              "26   B7   NaN              O=C(CCCCCCCCC#C)OCCCCCC=O\n",
              "27   B8   NaN              O=C(CCCCCCCCC=C)OCCCCCC=O\n",
              "28   B9   NaN          CCCCCCCCCCCCCCCC(OCCCCCC=O)=O\n",
              "29  B10   NaN        CCCCCCCCCCCCCCCCCC(OCCCCCC=O)=O\n",
              "30  B11   NaN     CCCCCCCC/C=C\\CCCCCCCC(OCCCCCC=O)=O\n",
              "31  B12   NaN  CCCCC/C=C\\C/C=C\\CCCCCCCC(OCCCCCC=O)=O\n",
              "32   C1   NaN                  CCCCCCCCCCCC[N+]#[C-]\n",
              "33   C2   NaN                CCCCCCCCCCCCCC[N+]#[C-]\n",
              "34   C3   NaN              CCCCCCCCCCCCCCCC[N+]#[C-]\n",
              "35   C4   NaN            CCCCCCCCCCCCCCCCCC[N+]#[C-]\n",
              "36   C5   NaN         CCCCCCCC/C=C\\CCCCCCCC[N+]#[C-]"
            ]
          },
          "execution_count": 11,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# show the dataframe read in\n",
        "mol_df = pd.read_csv(data_file)\n",
        "\n",
        "mol_df"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2IZXchH4y3WJ"
      },
      "source": [
        "- Calculate the Molecular weight of each molecule"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        },
        "id": "lQnUHJ4lzf09",
        "outputId": "83b8d10f-c850-4d44-b57e-a13f979e93b7"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>ID</th>\n",
              "      <th>Name</th>\n",
              "      <th>SMILES</th>\n",
              "      <th>mol</th>\n",
              "      <th>MW</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>A1</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CN(C)CCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>88.154</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>A2</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCCN1CCCC1</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>128.219</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>A3</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCN(CC)CCCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>130.235</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>A4</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCN(CCN)CC</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>116.208</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>A5</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCN(CCN)CCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>146.238</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "   ID  Name        SMILES                                            mol  \\\n",
              "0  A1   NaN      CN(C)CCN  <rdkit.Chem.rdchem.Mol object at 0x11fc818c0>   \n",
              "1  A2   NaN   NCCCN1CCCC1  <rdkit.Chem.rdchem.Mol object at 0x11fc819a0>   \n",
              "2  A3   NaN   CCN(CC)CCCN  <rdkit.Chem.rdchem.Mol object at 0x11fc81a10>   \n",
              "3  A4   NaN    CCN(CCN)CC  <rdkit.Chem.rdchem.Mol object at 0x11fc81a80>   \n",
              "4  A5   NaN  NCCN(CCN)CCN  <rdkit.Chem.rdchem.Mol object at 0x11fc81af0>   \n",
              "\n",
              "        MW  \n",
              "0   88.154  \n",
              "1  128.219  \n",
              "2  130.235  \n",
              "3  116.208  \n",
              "4  146.238  "
            ]
          },
          "execution_count": 13,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "assert len(mol_df.dropna(subset=[\"SMILES\"]).drop_duplicates(subset=[\"SMILES\"])) == len(mol_df)\n",
        "\n",
        "# add a column to store the molecule object\n",
        "mol_df[\"mol\"] = mol_df[\"SMILES\"].apply(Chem.MolFromSmiles).apply(show_atom_number)\n",
        "\n",
        "# add a column to store the molecular weight\n",
        "mol_df[\"MW\"] = mol_df[\"mol\"].apply(Descriptors.MolWt)\n",
        "\n",
        "mol_df.head()\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        },
        "id": "eprjZhY8zf0-",
        "outputId": "29cf368c-131a-4d72-b494-d90d54bb4435"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>ID</th>\n",
              "      <th>Name</th>\n",
              "      <th>SMILES</th>\n",
              "      <th>mol</th>\n",
              "      <th>MW</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>A1</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CN(C)CCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>88.154</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>A2</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCCN1CCCC1</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>128.219</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>A3</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCN(CC)CCCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>130.235</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>A4</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCN(CCN)CC</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>116.208</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>A5</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCN(CCN)CCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>146.238</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "   ID  Name        SMILES                                            mol  \\\n",
              "0  A1   NaN      CN(C)CCN  <rdkit.Chem.rdchem.Mol object at 0x11fc818c0>   \n",
              "1  A2   NaN   NCCCN1CCCC1  <rdkit.Chem.rdchem.Mol object at 0x11fc819a0>   \n",
              "2  A3   NaN   CCN(CC)CCCN  <rdkit.Chem.rdchem.Mol object at 0x11fc81a10>   \n",
              "3  A4   NaN    CCN(CCN)CC  <rdkit.Chem.rdchem.Mol object at 0x11fc81a80>   \n",
              "4  A5   NaN  NCCN(CCN)CCN  <rdkit.Chem.rdchem.Mol object at 0x11fc81af0>   \n",
              "\n",
              "        MW  \n",
              "0   88.154  \n",
              "1  128.219  \n",
              "2  130.235  \n",
              "3  116.208  \n",
              "4  146.238  "
            ]
          },
          "execution_count": 14,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# split the dataframe into four parts based on ID, (1) A*, (2) B*, (3) C*, (4) D*\n",
        "mol_df[\"ID\"] = mol_df[\"ID\"].str.strip()\n",
        "mol_df_A = mol_df[mol_df[\"ID\"].str.startswith(\"A\")]\n",
        "mol_df_B = mol_df[mol_df[\"ID\"].str.startswith(\"B\")]\n",
        "mol_df_C = mol_df[mol_df[\"ID\"].str.startswith(\"C\")]\n",
        "mol_df_D = mol_df[mol_df[\"ID\"].str.startswith(\"D\")]\n",
        "\n",
        "mol_df_A.head()\n",
        "\n",
        "# After running this, you will have four dataframes that storing each conponents individually"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "V9XESqixzf0-"
      },
      "source": [
        "### 2.1. Structurize the df of component A, R1-NH2, break the functional group for the assemble"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "metadata": {
        "id": "UbUR3EV2zf0-"
      },
      "outputs": [],
      "source": [
        "# extract the -NH2 group from mol_df_A molecules, store the index for them\n",
        "frag_NH2 = Chem.MolFromSmarts(\"[NH2]\")\n",
        "mol_df_A[\"NH2_pos\"] = mol_df_A.apply(\n",
        "    lambda x: x[\"mol\"].GetSubstructMatch(frag_NH2), axis=1\n",
        ") # only return the first index\n",
        "\n",
        "\n",
        "# replace the NH2 group with an open end atom *, that can later be connected to other molecules\n",
        "mol_df_A[\"main_compoent\"] = mol_df_A.apply(\n",
        "    lambda x: Chem.ReplaceSubstructs(x[\"mol\"], frag_NH2, Chem.MolFromSmiles(\"*\"))[0],\n",
        "    axis=1,\n",
        ")\n",
        "# renumbering\n",
        "mol_df_A[\"main_compoent\"] = mol_df_A[\"main_compoent\"].apply(show_atom_number)\n",
        "\n",
        "mol_df_A[\"main_compoent_SMILES\"] = mol_df_A[\"main_compoent\"].apply(Chem.MolToSmiles)\n",
        "mol_df_A[\"main_num_atoms\"] = mol_df_A[\"main_compoent\"].apply(lambda x: x.GetNumAtoms())\n",
        "\n",
        "assert len(mol_df_A.drop_duplicates(subset=[\"main_compoent_SMILES\"])) == len(mol_df_A)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 16,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        },
        "id": "kgQVY4pvzf0-",
        "outputId": "71812fae-05d4-4566-e385-42be3fcccfa3"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>ID</th>\n",
              "      <th>Name</th>\n",
              "      <th>SMILES</th>\n",
              "      <th>mol</th>\n",
              "      <th>MW</th>\n",
              "      <th>NH2_pos</th>\n",
              "      <th>main_compoent</th>\n",
              "      <th>main_compoent_SMILES</th>\n",
              "      <th>main_num_atoms</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>A1</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CN(C)CCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>88.154</td>\n",
              "      <td>(5,)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCN(C)C</td>\n",
              "      <td>6</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>A2</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCCN1CCCC1</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>128.219</td>\n",
              "      <td>(0,)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCN1CCCC1</td>\n",
              "      <td>9</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>A3</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCN(CC)CCCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>130.235</td>\n",
              "      <td>(8,)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCN(CC)CC</td>\n",
              "      <td>9</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>A4</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCN(CCN)CC</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>116.208</td>\n",
              "      <td>(5,)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCN(CC)CC</td>\n",
              "      <td>8</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>A5</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NCCN(CCN)CCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>146.238</td>\n",
              "      <td>(0,)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCN(CCN)CCN</td>\n",
              "      <td>10</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "   ID  Name        SMILES                                            mol  \\\n",
              "0  A1   NaN      CN(C)CCN  <rdkit.Chem.rdchem.Mol object at 0x11fc818c0>   \n",
              "1  A2   NaN   NCCCN1CCCC1  <rdkit.Chem.rdchem.Mol object at 0x11fc819a0>   \n",
              "2  A3   NaN   CCN(CC)CCCN  <rdkit.Chem.rdchem.Mol object at 0x11fc81a10>   \n",
              "3  A4   NaN    CCN(CCN)CC  <rdkit.Chem.rdchem.Mol object at 0x11fc81a80>   \n",
              "4  A5   NaN  NCCN(CCN)CCN  <rdkit.Chem.rdchem.Mol object at 0x11fc81af0>   \n",
              "\n",
              "        MW NH2_pos                                  main_compoent  \\\n",
              "0   88.154    (5,)  <rdkit.Chem.rdchem.Mol object at 0x11fc81540>   \n",
              "1  128.219    (0,)  <rdkit.Chem.rdchem.Mol object at 0x11fc82e30>   \n",
              "2  130.235    (8,)  <rdkit.Chem.rdchem.Mol object at 0x11fc82ea0>   \n",
              "3  116.208    (5,)  <rdkit.Chem.rdchem.Mol object at 0x11fc82f10>   \n",
              "4  146.238    (0,)  <rdkit.Chem.rdchem.Mol object at 0x11fc82f80>   \n",
              "\n",
              "  main_compoent_SMILES  main_num_atoms  \n",
              "0             *CCN(C)C               6  \n",
              "1          *CCCN1CCCC1               9  \n",
              "2          *CCCN(CC)CC               9  \n",
              "3           *CCN(CC)CC               8  \n",
              "4         *CCN(CCN)CCN              10  "
            ]
          },
          "execution_count": 16,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "mol_df_A.head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3pe4NcHHzf0_"
      },
      "source": [
        "### 2.2. Structure the DataFrame of component C, R3-NC"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 18,
      "metadata": {
        "id": "j29xFv1ozf0_"
      },
      "outputs": [],
      "source": [
        "# Locate the isocyanide (-NC) group in each mol_df_C molecule and store its atom indices\n",
        "frag_NC = Chem.MolFromSmarts(\"N#C\")\n",
        "\n",
        "mol_df_C[\"NC_pos\"] = mol_df_C.apply(\n",
        "    lambda x: x[\"mol\"].GetSubstructMatch(frag_NC), axis=1\n",
        ")  # GetSubstructMatch returns only the first match\n",
        "\n",
        "# Replace the first NC group with a dummy atom (*), an open attachment point\n",
        "# that can later be connected to other molecules\n",
        "mol_df_C[\"main_component\"] = mol_df_C.apply(\n",
        "    lambda x: Chem.ReplaceSubstructs(x[\"mol\"], frag_NC, Chem.MolFromSmiles(\"*\"))[0],\n",
        "    axis=1,\n",
        ")\n",
        "\n",
        "# Renumber the atoms of the resulting fragment\n",
        "mol_df_C[\"main_component\"] = mol_df_C[\"main_component\"].apply(show_atom_number)\n",
        "\n",
        "mol_df_C[\"main_component_SMILES\"] = mol_df_C[\"main_component\"].apply(Chem.MolToSmiles)\n",
        "mol_df_C[\"main_num_atoms\"] = mol_df_C[\"main_component\"].apply(lambda x: x.GetNumAtoms())\n",
        "\n",
        "# Sanity check: every fragment SMILES must be unique\n",
        "assert len(mol_df_C.drop_duplicates(subset=[\"main_component_SMILES\"])) == len(mol_df_C)\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 24,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 692
        },
        "id": "LZC3eIf0zf1A",
        "outputId": "8b7d7edc-7465-4be1-9d2e-6d83b1de1e8a"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>ID</th>\n",
              "      <th>Name</th>\n",
              "      <th>SMILES</th>\n",
              "      <th>mol</th>\n",
              "      <th>MW</th>\n",
              "      <th>NC_pos</th>\n",
              "      <th>main_component</th>\n",
              "      <th>main_component_SMILES</th>\n",
              "      <th>main_num_atoms</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>32</th>\n",
              "      <td>C1</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCC[N+]#[C-]</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>195.350</td>\n",
              "      <td>(12, 13)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCCCCCCCC</td>\n",
              "      <td>13</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>33</th>\n",
              "      <td>C2</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCCCC[N+]#[C-]</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>223.404</td>\n",
              "      <td>(14, 15)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCCCCCCCCCC</td>\n",
              "      <td>15</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>34</th>\n",
              "      <td>C3</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCCCCCC[N+]#[C-]</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>251.458</td>\n",
              "      <td>(16, 17)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCCCCCCCCCCCC</td>\n",
              "      <td>17</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>35</th>\n",
              "      <td>C4</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCCCCCCCCCCCC[N+]#[C-]</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>279.512</td>\n",
              "      <td>(18, 19)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCCCCCCCCCCCCCC</td>\n",
              "      <td>19</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>36</th>\n",
              "      <td>C5</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CCCCCCCC/C=C\\CCCCCCCC[N+]#[C-]</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>277.496</td>\n",
              "      <td>(18, 19)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCCCC/C=C\\CCCCCCCC</td>\n",
              "      <td>19</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "    ID  Name                          SMILES  \\\n",
              "32  C1   NaN           CCCCCCCCCCCC[N+]#[C-]   \n",
              "33  C2   NaN         CCCCCCCCCCCCCC[N+]#[C-]   \n",
              "34  C3   NaN       CCCCCCCCCCCCCCCC[N+]#[C-]   \n",
              "35  C4   NaN     CCCCCCCCCCCCCCCCCC[N+]#[C-]   \n",
              "36  C5   NaN  CCCCCCCC/C=C\\CCCCCCCC[N+]#[C-]   \n",
              "\n",
              "                                              mol       MW    NC_pos  \\\n",
              "32  <rdkit.Chem.rdchem.Mol object at 0x11fc82730>  195.350  (12, 13)   \n",
              "33  <rdkit.Chem.rdchem.Mol object at 0x11fc827a0>  223.404  (14, 15)   \n",
              "34  <rdkit.Chem.rdchem.Mol object at 0x11fc82810>  251.458  (16, 17)   \n",
              "35  <rdkit.Chem.rdchem.Mol object at 0x11fc82880>  279.512  (18, 19)   \n",
              "36  <rdkit.Chem.rdchem.Mol object at 0x11fc828f0>  277.496  (18, 19)   \n",
              "\n",
              "                                   main_component   main_component_SMILES  \\\n",
              "32  <rdkit.Chem.rdchem.Mol object at 0x11fd58580>           *CCCCCCCCCCCC   \n",
              "33  <rdkit.Chem.rdchem.Mol object at 0x11fd585f0>         *CCCCCCCCCCCCCC   \n",
              "34  <rdkit.Chem.rdchem.Mol object at 0x11fd58660>       *CCCCCCCCCCCCCCCC   \n",
              "35  <rdkit.Chem.rdchem.Mol object at 0x11fd586d0>     *CCCCCCCCCCCCCCCCCC   \n",
              "36  <rdkit.Chem.rdchem.Mol object at 0x11fd58740>  *CCCCCCCC/C=C\\CCCCCCCC   \n",
              "\n",
              "    main_num_atoms  \n",
              "32              13  \n",
              "33              15  \n",
              "34              17  \n",
              "35              19  \n",
              "36              19  "
            ]
          },
          "execution_count": 24,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "mol_df_C.head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FLZHcZamzf1A"
      },
      "source": [
        "### 2.3. Structure the DataFrame of component B, R3-CHO"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 25,
      "metadata": {
        "id": "byDLUSA7zf1A"
      },
      "outputs": [],
      "source": [
        "# Locate the aldehyde (-CHO) group in each mol_df_B molecule and store its atom indices\n",
        "frag_CHO = Chem.MolFromSmarts(\"[CX3H1](=O)\")\n",
        "\n",
        "mol_df_B[\"CHO_pos\"] = mol_df_B.apply(\n",
        "    lambda x: x[\"mol\"].GetSubstructMatch(frag_CHO), axis=1\n",
        ")  # GetSubstructMatch returns only the first match\n",
        "\n",
        "\n",
        "# Replace the first CHO group with a dummy atom (*), an open attachment point\n",
        "# that can later be connected to other molecules\n",
        "mol_df_B[\"main_component\"] = mol_df_B.apply(\n",
        "    lambda x: Chem.ReplaceSubstructs(x[\"mol\"], frag_CHO, Chem.MolFromSmiles(\"*\"))[0],\n",
        "    axis=1,\n",
        ")\n",
        "# Renumber the atoms of the resulting fragment\n",
        "mol_df_B[\"main_component\"] = mol_df_B[\"main_component\"].apply(show_atom_number)\n",
        "\n",
        "mol_df_B[\"main_component_SMILES\"] = mol_df_B[\"main_component\"].apply(Chem.MolToSmiles)\n",
        "mol_df_B[\"main_num_atoms\"] = mol_df_B[\"main_component\"].apply(lambda x: x.GetNumAtoms())\n",
        "\n",
        "# Sanity check: every fragment SMILES must be unique\n",
        "assert len(mol_df_B.drop_duplicates(subset=[\"main_component_SMILES\"])) == len(mol_df_B)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 26,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 692
        },
        "id": "SepVI_GGzf1A",
        "outputId": "1cbf161a-6b63-4ea5-b903-f610ecc6f428"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>ID</th>\n",
              "      <th>Name</th>\n",
              "      <th>SMILES</th>\n",
              "      <th>mol</th>\n",
              "      <th>MW</th>\n",
              "      <th>NC_pos</th>\n",
              "      <th>main_compoent</th>\n",
              "      <th>main_compoent_SMILES</th>\n",
              "      <th>main_num_atoms</th>\n",
              "      <th>CHO_pos</th>\n",
              "      <th>main_component</th>\n",
              "      <th>main_component_SMILES</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>20</th>\n",
              "      <td>B1</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(CCCCCCCC)OCCCCCC=O</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>256.386</td>\n",
              "      <td>()</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCCCC(=O)OCCCCCC=O</td>\n",
              "      <td>17</td>\n",
              "      <td>(16, 17)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCOC(=O)CCCCCCCC</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>21</th>\n",
              "      <td>B2</td>\n",
              "      <td>NaN</td>\n",
              "      <td>CC(CCC(OCCCCCC=O)=O)CCCCC</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>270.413</td>\n",
              "      <td>()</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCC(C)CCC(=O)OCCCCCC=O</td>\n",
              "      <td>18</td>\n",
              "      <td>(11, 12)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCOC(=O)CCC(C)CCCCC</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>22</th>\n",
              "      <td>B3</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(CCCCCCCCC)OCCCCCC=O</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>270.413</td>\n",
              "      <td>()</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCCCCC(=O)OCCCCCC=O</td>\n",
              "      <td>18</td>\n",
              "      <td>(17, 18)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCOC(=O)CCCCCCCCC</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>23</th>\n",
              "      <td>B4</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(/C=C/CCCCCCC)OCCCCCC=O</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>268.397</td>\n",
              "      <td>()</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCC/C=C/C(=O)OCCCCCC=O</td>\n",
              "      <td>18</td>\n",
              "      <td>(17, 18)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCOC(=O)/C=C/CCCCCCC</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>24</th>\n",
              "      <td>B5</td>\n",
              "      <td>NaN</td>\n",
              "      <td>O=C(/C=C\\CCCCCCC)OCCCCCC=O</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>268.397</td>\n",
              "      <td>()</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCC/C=C\\C(=O)OCCCCCC=O</td>\n",
              "      <td>18</td>\n",
              "      <td>(17, 18)</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>*CCCCCOC(=O)/C=C\\CCCCCCC</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "    ID  Name                      SMILES  \\\n",
              "20  B1   NaN      O=C(CCCCCCCC)OCCCCCC=O   \n",
              "21  B2   NaN   CC(CCC(OCCCCCC=O)=O)CCCCC   \n",
              "22  B3   NaN     O=C(CCCCCCCCC)OCCCCCC=O   \n",
              "23  B4   NaN  O=C(/C=C/CCCCCCC)OCCCCCC=O   \n",
              "24  B5   NaN  O=C(/C=C\\CCCCCCC)OCCCCCC=O   \n",
              "\n",
              "                                              mol       MW NC_pos  \\\n",
              "20  <rdkit.Chem.rdchem.Mol object at 0x11fc821f0>  256.386     ()   \n",
              "21  <rdkit.Chem.rdchem.Mol object at 0x11fc82260>  270.413     ()   \n",
              "22  <rdkit.Chem.rdchem.Mol object at 0x11fc822d0>  270.413     ()   \n",
              "23  <rdkit.Chem.rdchem.Mol object at 0x11fc82340>  268.397     ()   \n",
              "24  <rdkit.Chem.rdchem.Mol object at 0x11fc823b0>  268.397     ()   \n",
              "\n",
              "                                    main_compoent        main_compoent_SMILES  \\\n",
              "20  <rdkit.Chem.rdchem.Mol object at 0x11fc82a40>      CCCCCCCCC(=O)OCCCCCC=O   \n",
              "21  <rdkit.Chem.rdchem.Mol object at 0x11fc83c30>   CCCCCC(C)CCC(=O)OCCCCCC=O   \n",
              "22  <rdkit.Chem.rdchem.Mol object at 0x11fc83ae0>     CCCCCCCCCC(=O)OCCCCCC=O   \n",
              "23  <rdkit.Chem.rdchem.Mol object at 0x11fc836f0>  CCCCCCC/C=C/C(=O)OCCCCCC=O   \n",
              "24  <rdkit.Chem.rdchem.Mol object at 0x11fc83760>  CCCCCCC/C=C\\C(=O)OCCCCCC=O   \n",
              "\n",
              "    main_num_atoms   CHO_pos                                 main_component  \\\n",
              "20              17  (16, 17)  <rdkit.Chem.rdchem.Mol object at 0x11fd59930>   \n",
              "21              18  (11, 12)  <rdkit.Chem.rdchem.Mol object at 0x11fd59070>   \n",
              "22              18  (17, 18)  <rdkit.Chem.rdchem.Mol object at 0x11fd593f0>   \n",
              "23              18  (17, 18)  <rdkit.Chem.rdchem.Mol object at 0x11fd59d90>   \n",
              "24              18  (17, 18)  <rdkit.Chem.rdchem.Mol object at 0x11fd59e00>   \n",
              "\n",
              "       main_component_SMILES  \n",
              "20      *CCCCCOC(=O)CCCCCCCC  \n",
              "21   *CCCCCOC(=O)CCC(C)CCCCC  \n",
              "22     *CCCCCOC(=O)CCCCCCCCC  \n",
              "23  *CCCCCOC(=O)/C=C/CCCCCCC  \n",
              "24  *CCCCCOC(=O)/C=C\\CCCCCCC  "
            ]
          },
          "execution_count": 26,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "mol_df_B.head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GFPiqxU4zf1C"
      },
      "source": [
        "### 3. Combine components"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 29,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 417
        },
        "id": "nOp6Jl46zf1C",
        "outputId": "1e0bd6b0-d942-4866-81ae-90c96a24a5f4"
      },
      "outputs": [
        {
          "data": {
            "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAGQAZADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD3+iiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAoooJwMnpQAUVQ0iSWWx3T3kF3J5j5lgIK43EgcegIH4dT1q/QNpp2YUUUUCCiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKYkscjOqSKxQ7WCnO0+h9KAH0UUUAFZviBol8Pag08oiiEDF3ZSQBjuByR64rSrC8aMqeCdaZ2VVFpJlmOAOO5qZ/CzfDJOtBN21X5kfh/UbfU9T1GeF4lcRW6vDHuIUFWZWJKr
kkMOnQKM+g6GvP/AIcyxyaprgjljciGwyFcHGYBjOK9AqKLbhd+f5m2YwhDEONN3Vo/+koKKKK1OIKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigDB1DxMlhe3Nq1tK7xPbqrLG5U+YwXlgpC4z3PNcnrHiJtE1C5lSJxHJC8xcSkBWyzKp9AWLc+rAd+PQZNPtZmlaSFWMpRnz/ABFDlfyIrm/+EMS4S7F2kDC5LK0e5nURnHyjIB7Z+uTnnjspyw2jlHbfz22/Fku5uaPeteWzb2DMhxu9f84NaNcBL4F1Tw/PJqPg3VTBNIxkuNPuyXtrhupI7oT6j9KitPipb/bv7I1uwfQtZXhoL1h5cnvHIPlYfj+dZ1qVOCUqc+ZP5Neq/wAmxpvqeiVyfxNmig+GniEzSpGGspEUuwGWIwAPcntWJP8AFCXUbp9I8JaX/berDhpUbbaQe7yf0B59c1Pp3w4k1K+j1fxzqP8AbuoId0VpjbZ2x9Fj/i+rde471zjOb+COq2Oo6x4ka2uFbdBp4VTwx2QbGIB5wGGM/T1r2SuM1f4a6PrMMczl7PVoGLQanYfuZo+SQOCcqOmDnj0rPk8SeJfBA8vxNaHVtKHyx6tZr86jt50fb3YcfU1vQw867cYWv2vq/Tv6b9hN23PQ6K5K38dWMpRzIhieQxAKjBgwYKQQehyw64656V01vdwXTzLDJvMLhJOCMMVVwPf5WU/jWcqU4/EmguieiiioGFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFY+qeIrPTb6LTwyzajKnmraKT5jR5wWAAPQ/StCxuXu7KKeSB4HcZMb9VNAFiiiigAooooAKKKKACiiigArjvGfhaHXLK4SbTor9WUtHG65IfHGDkEfUEV2NFAHOeF/D8GiW0cVrZpZ28abY4VGPxPv7nk10dFFABVTUbVru0KL94Hcue9W6KAPL7zwYY5EmlaVYw+Jh5fyvmZX3HnsBgkZ4+ldja6edBult9PUeTeXPmOgt2KxqI40PzA4H3M8jkt6Amt1kVwA6hgCCARnkdDTq6Xi6slabuhcqCiiiuYYUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQB578Qf8AiTeK/B3ihflSC+On3Lf9Mp12gn2UjP416FXLfEfRjr3w91qxQEzfZzNDjr5kfzrj3JXH41oeEtZHiHwjpOrAgtdWqO+Oz4ww/BgRQBs0UUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQBVvpbiNIvs8YdnlVWyu4BT1J5H9f6jzrSfAPjDwtp6Q+GfFcKW2TINM1G0DxRljkqJFO4DJPT616fRQB56fFvjrRcDXPBH26IfeudEuRLn6RN8361YsvipoGp3cWnW7XVpq8rhUsb+0eKRsnp6evc/Q9K7qvP/AIu2zQ+GLTxFBHuu9Bvob5CB8xQMA6/Qg5P+7QB22mz3Fzp8M13B5E7D54/7pz/n/wCtVqo4J47m3jnhYPFKgdGHcEZBqSgAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAorz3X/iDpXh3xd9gmn1G6aIme5SygaZLdCgULJhvl5G7gHr2zmt3R/iF4R17aNP1+yeRukUknlSH/gL4P6UAdLRR1rjvGPjKXSbu20DQrYah4kvf9Tbg/LAneWQ/wqP1oAu614vg0zXrDQrO0l1HVLpgz28DAeRD/FI5PAHoD1rbsr6K/hMkQ
cKMfeGOqhh+jCsTwn4Rg8N2k0k8xvtVvDvvr6UZaZvT2UdhXQRQRQKVhiSMHqEUD27UASUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABVLWNNi1jRb7TJ/9Vd27wN7BlIz+tXar31y1nYzXCQmZ0XIjXOWPpwCf0oA5L4UalLf/AA80+C54u9OL6fcKf4WiO0D/AL5212teX2nh7xto2s63qfhefTPsmoXr3MmnanG64c/eZHX+9weePyybx+IPiDRxjxL4F1SFR1udMZbyP6kDBUfWgD0KiuO074peD9VjYWmt2y3IGBb3W6By390Bhkn2UGuk03UG1BJma1lt/Lk2ASD73yg5HYjJIyM9KALtFFFABRRRQAUUUUAFFFZ2sJm2SVrxrZI2O4qGYtuUoBgEZO5gQOckCgaTbsjRoqtY3EE9uBbu7LH+7IkDB1IA4YNznGDz6571ZovcGnF2YUhzg469qWvPfE/iXUte1mTwb4Pl23gGNT1QcpYRnqoPeU84Hb65KgjY0jxppdxqV1pl7rGnJfwbFaD7RGGD7cOoGTnDA9z1xxjnqutcda/C3wbb6LDpkuhWlysY+a4njBmkY9WaQfNk/Ws5vhTDp3zeF/EuuaHj7sMdyZoB9Y36/nQB6FXC+L/Ft82qL4S8JhJ/EM65mmPMenxHrI/+1zwv09gczUY/ivpNr9mg1XQNSSc+St9NA0EsJP8AGVHycfjzjg11HgnwxpvhrRF+xTNdz3f7+6v5eZLpzyWYnnHJwO31ySATeEfCVj4R0o2tqzz3Mzebd3kvMtzKerMfzwO35kv1nwb4b8Q5Oq6JZXTnrI8QEn/fYw361uUUAedv8K4dKR5PC/iXW9D2glYUuPOtx9Y36/iav/D3QbDS7e8vFuZdR1e7ZZb3UpwoeYsNwUAM20AEfLmu1ooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigDi/id4Vs/EXgfVv9CgfUIbdprebywZFZPmAVuozjH41u+FdZXxB4U0rVlIJu7ZJHx2cj5h+ByPwrWIBBBAIPUGvPvhSTptpr3hVzhtE1OWOFT/AM8JDvjP45agD0KiiigAqje6taafMkVwzqzozrhCRheW6eg5q6SFBJIAHJJry2/1DVfiTrslr4Tu003R7Ftlzrn2dZGuJFORFED1UHknOPw+8Aep0V58E+KWiciXRPEsA7Mps52+mPkFJ/wtP+zOPFHhTXNFA+9P5H2i3X/ton+FAHoVcydZj8QaxPpOnRs8VmVkkvxgxpMrBlTH8XQ5wf8AGuRvvHD/ABD1hPDPgnUPLtfLE+parsKmOLONkasASx6ZxgZ/L0jStMs9H06KxsIljt4hgAdSe5J7mokpN26HRRnTpxcmry6dl5+fktu/Yx73wxeXTtcw6/eWd40hkZrZQsTEqqjMZznAQdT6+tZd63xH0hUazj0jX4gTvWQm1mYdsfwDv1ruKKFTSd0OWLqTp+zlZr0V16O1zzHUPFPjXxCI/D+neF9Q8P39ydtzqVziSC2j7mN14Zz0HT+o7Xwx4Y03wlo0em6bGQoO+WV+ZJpD1dz3J/8ArVPfaLb399b3cksySQMrKIyACQSRnjJ6njPetKrOYKKKKAIri2huoxHPGHTIbaehx6juPanQwxW8KQwxrHGgwqKMAD2p9FABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRVVtRskaRXuolMe7fuYALtGTk9sDn6UATzTRW8Ek88iRxRqXd3OFVRySSegrx7TtS8Sar421nxj4K0WG70i5jjtXF5P5BvmjyPMiyOAB8uW/nkDSkkuvi5qLW8DS23ge1lxLKCUfVJFP3V7iIHqe/1+73l3o7ta29rp1yun28ERjSOJGAUYA
GArKAAOnHXHbIIByn/C0v7M48UeFdc0bH3p/J+0W6/9tE/wrodH8c+FteA/szXrGdjz5fmhH/74bDfpXQVy/iHwH4Y122ne88O2VxcFSQ6RCORj2G9Sp/8AHhQBzGp6je/E/VJ9A0K4kt/C9s/l6nqkRwbo94IT6erf0xu9F03TbPSNOg0/T7dLe0gQJHEgwFH+e/eqfhzTRo+jRadHZQ2cFvhIYoR8u3AOepJ5Lck5PU1rUAFNdFkRkdQysMEHoRTqKAPnrwppUmheMbPT3hdLeO6vdAuZAoBaFz5sD5IwSWLDnP3RXvWn6dDpscyQs7CWVpmLnJ3Hr/KvN/HNjNb+Jrt7VSZby1jvYNoyTcW7c4H/AFzJ/OvTrWb7TaQz7Gj82NX2OMFcjOCOxrKFRyk4tbHfisHGjRp1Yyvzf8D9br5EtFFFanAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFc94h8G6V4k0+SyvTdJFLMJZfJuGUvyMqefukAAj06YroaKAIbW1t7G0itLSFIbeFAkcca4VVHQAVNRRQAUUUUAFFFFABRRRQBmaxbvOLRhA08Mc2Z4lxl02sMYJGRu2nHt36VJo8M0GnhJ1ZD5khRHbcUQuSqk5PRcD26VeddyMvHIxyMj8qoaVpQ0xHUTvKX253dOBjPU8n/AdBU8utzV1m6ap9P+H/AMzQoooqjIKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKzdY1QaYlrhoBJPOsSrLKELZ7KCRuboAPfPQVpUAFFFFABXncaTeI/iv4j06a9ul0/T7C1jMEcnyCRyZN205AbAxnGccZr0SuG8BYu/EHjTVML+91Y2u4AZYQqFHPfrW0KLnTnUW0bfi7Cb1sdlaWqWdqlvGSUTOM47nPbAHXoOBU9FFYjCiiigAorO12/TTNEu7yS6itRFGWE0rKFU9s7uKqeFtch8QaW99b39teRNKQj27ZAGBwR1HOcA84xmgDcooooAKKK5U+LbEeNxoI1i08/g/ZvMTcPlOV+udpxnPXjGDQB1VFFFABRRWL4n1228PaUt/dX1vaRLMgZ52ADLnkdyTjPQE0AbVFZug341PRbe8F3bXQlBYTWzbkYZOMH6Y/GtKgAooooAKK5PQvFdtq3iu/0yHWbC5FupxBHIvmBt3t94AccdO/JwOsoAKKKKACiuY8Y+Jrbw3DZtPqtrYmaQp+/I+YbTyB14OOeB6kZrobOXz7KCbzI5C8asXibcjZHVT3FAE1FFFABRUc8ght5JSyKEUtukbaowO57D3rmfB/ieLxFJeiLV7C/EBCYtXztPc4wDg+vT070AdVRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQByWq+PLHw/O1lqcc8uqO7G2sbKEyyyx5+VgBwOhHJHQ1mvN8QvFEZFrFbeFLNiP3s4FxdFfUL91T7HkV3nlR+cZvLXzSu0vj5sdcZ9KfQB56Pg54auYJm1qXUdY1CZdrX95dMZUPqmMBcHpwfxqrb6/rfw4uY9P8WSyal4dZglrrgXLwZ6JcAflv/n29MqK4t4bu2kt7mFJoJVKSRyKGVgeoIPUUALBPDcwRz28qSwyKGSSNgysD0II6ipOleX3HhzxN4CvVHgeW2utIvJNp0rUZG2WkjfxRvnIXPVc/nni9F4B1vXnE3jXxNPdxnrpmnZt7UezEfM49zg1UVFySk7IDT13x7otutzpun6tDJq5hkMSwr5wjZVJyxHyjp0JpfCV5Z215qOlmO3sbs3JZrUNgvLsVpXXJywJIYH0YdK2LPQbDQtKmttC0+1tG8s7AiAbmxxuPU8+tUNb8JWvijSrUakGt9VgRXivbY
7ZYJcDJVh2z26foa741cLb2KTUX10bv0dl08rt72ZFpbnS0V59YeM77wxqcGgeN8RmU7LPWlXbb3J7K/wDcfHrgdfYn0HORkVw1IqEnFO9uq2LRmahrlrp13HbTJKzuFOYwG2gttBIzkc+3PbOK53U/iDFLqD6R4Vsn1zVF4cwnFvb+8knT8B9Mg11eo6da6tp1xYXkXmW9xGY5FyQSp9xyKZpek6folillplpFa2ydI4lwPqfU+55qAOKsfhzcazex6r491H+2btDuisIwUsrc+yfxn3br3BqfVvhpaC9OreE7t/DmrgfftFHkTY/hki+6R9PrzXd0UAeeWnxDvtAu49M8f6cNLlc7ItUt8vZTn/e6xn2P14rpNf8AGvh/w1p0d7qOoxBJlzbxxHzJJ89Nijlvr0561sXdnbahaSWt5bxXFvKNrxSoGVh6EHg1xfg3wD4f0DVrq+tdCSC5Ylo5ZA7bBvcYTdnaMBTxz83pQBQ8vxt8QP8AWmbwn4ef+BT/AKfcL7npED+f1Fa4+FPgsaH/AGV/YcHl/e+0c/aN397zfvZ/HHtiuzooA818vxt4A5iabxZ4fT/lmx/0+3X2PSUD8/oK6rw/418P+JdPkvNP1GLbCCbiKY+XJBjrvU8rj16e9dBXBeNfAOg69qMF/caDHc3SAs7xBkMpGMK5UjcD055GfQGgCG8+Id7r93JpngDThqkyHZLqk+UsoD/vdXPsv4ZqxpPwztGvRq3i27bxHrBGN92o+zwg9Vji+6B9R78V2tpZ21haR2tnbxW9vENqRRIFVR6ADpU1AHn198ObjR7yTVfAWo/2Ndud8thIC1lcH3T+A+69OwFS6T8SY4dQTRvGFg3h/Vm4RpmzbXHvHL0/A/TJNd5Wdrmkafrekz2WpWUN5Ayk+XNHvGccEY5z7jn0oAyfE/jzRPCzJb3Er3WpTf6jTrNfNnlJ6YUdAfU4rnP7A8XePv3nie5bQdDfkaPYyZmmX0ml7D/ZH6GtfwF4M0jwvaytaaOlpdORumcs8hBAO3e3OB6DA+pya7OgDjL/AOFnhG70yC0ttLTTpbbm3u7E+VPE397eOWP+9msj+2PGPgH5Nfgk8SaEnTU7OPF1Av8A01j/AIgP7w+pPavSqKAMSx8X+HdR0JtbttYs205BmSdpAoj9mzgqfY81yUnjTxB4zka18B2AhsclX17UIysQ/wCuKHlz7kY9R3qxqnw18N6h4wi1N/D9s7Aq0h2FYnOTkuoIVjjvg89Rzmu+REijWONVRFACqowAPQCgDi9H+F+g2by3esofEGqXC4nvdTUSk+yochB6Y5HrWdL4G1vwjM954B1ALbEl5NDv3L27+vlseYz+nqccV6PRQBxvh74jadqt+NH1W3m0PXl4awvvlLn1jfo49Mcn0p3iP4iaZot9/ZOnwza1rrZCadYjeyn/AKaN0Qeuefar/jLw3pniTQprfUNMivmUZhBU71b/AGGHKn36evFM8FeGtO8NaKLex0qKxYsRIRkvLgnDMx+Y/j+Q6UAc5H4J17xjKt149vwllkNHoVg5WFfTzXHLn6cZ6HtWlrHww8P3ohuNJiOg6lbri3vdLAhZfZgMBx655967WigDzdPGPiPwS62/jmx+16cCFTXtPjJQD/ptGOUPuOOwB6132n6jZatYx3un3UN1ayjKSwuGU/iKsSKHidWQOrKQVbo3tXNeEfD0Ggyag0Gj2+m/a2WWRbc/IzgsOF3EAbQp4A+8aAOnooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKAGPEkjRsy5MbblPocEfyJp9FFABRRRQBkeJ9MXV/Dl5YtYW1/5qYFvcfcbkd8jB9CCMGrmlabbaPpVrptoHFvbRrFGHcsdoHGSeTVuigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooo
oAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigD//2Q==",
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZAAAAGQCAIAAAAP3aGbAAAyNklEQVR4nO3dCVxU1f8//jPAILtssuiAiLjhHiLugCxqoqZJZoVZGWaZ/rJv2fIprW8pavVx+6R8DYuwUtTEDZcBRFMMRBBEAQEREBj2fZvt/h/D9T/xQTOEuc6c4fV8+OgxwXjPGRxe8z73nnsOj2EYAgBAAx11dwAAoKsQWABADQQWAFADgQUA1EBgAQA1EFgAQA0EFgBQA4EFANRAYAEANRBYAEANBBYAUAOBBQDUQGABADUQWABADQQWAFADgQUA1EBgAQA1EFgAQA0EFgBQA4EFANRAYAEANRBYAEANBBYAUAOBBQDUQGABADUQWABADQQWAFADgQUA1EBgAQA1EFgAQA0EFgBQA4EFANRAYAEANRBYAEANBBYAUAOBBQDUQGABADUQWABADQQWAFADgQUA1EBgAQA1EFgAQA0EFgBQA4EFANRAYAEANRBYAEANBBYAUAOBBQDUQGABADUQWABADQQWAFADgQUA1EBgAQA1EFgAQA0EFgBQA4EFANRAYAEANRBYAEANBBYAUAOBBQDUQGABADUQWABADQQWAFADgQUA1EBgAQA1EFgAQA0EFgBQA4EFANRAYAEANRBYAEANBBYAUAOBBQDUQGABADUQWABADQQWAFADgQUA1EBgAQA1EFgAQA0EFgBQA4EFANRAYAEANRBYAEANBBYAUAOBBQDUQGABADUQWABADQQWAFADgQUA1EBgAQA1EFgAQA0EFgBQA4EFANRAYAEANRBYAEANBBYAUAOBBQDUQGABADUQWABADQQWAFADgQUA1EBgAQA1EFgAQA0EFgBQA4EFANRAYKlfc3MzwzDq7gUABRBYatbU1BQUFNTY2KjujgBQAIGlZqGhoVOmTEFgAXQFAkvNZs+e3adPH6lUqu6OAFCAh7MnAEALVFgAQA09dXegdyssJN99R6qqyLRp5M03iQ4+PwAeB78h6tPcTAIDSXAw2b+fiERk0yZ1dwhA0+EclvqcO0diY8nWrYrHUinx8CDXr6u7TwAaDRWW+tTVEQuLB4/19IhMpub+AGg8BJb6jBpFkpMfPM7PJ/b2au4PgMbDSXf1cXUlgweT118nI0aQM2fIN9+ou0MAmg7nsNRHKiX19aSsjDQ2kv79iakpMTNTd58ANBqGhOrzxx/Eyoq8/TbR1ycCAZk+Xd0dAtB0CCwAoAYCCwCogcACAGogsACAGggsAKAGAgsAqIHAAgBqYKa72jTq6cnGjJEMGGCsqyseM0bq5GSl7i4BaDhUWH9JS0vbv39/SUnJ02numlRqnp4eWFx8RyYzT0+fee/e02kXgF4IrL9cuHAhKCgoKipKIpE0NDQ0NTXJ5XJ1dwoA/oIh4V90dXVTU1Obm5szMjJ4PB4hhGEYc3NzBwcHPT38oADUDxXWXxYuXHj79m0vLy+5XC5rJ5fLa2trb9261dbWpu7eAUBvDayaGlJR8eBxYSFpbVVsv1xRUTF69Oj6en5pqf7//y0DmUwulUpzcnKwrAWA2vXSwDp9mvz224PHX35J7twh9+/fZ89YXb7c97XXhtfX6xJCNm92lEgUY0OJRFJTU6PmTgP0er00sDphGKbj3suTJ9fv3j2g4xPkcnl1dTUXTUskkrt373JxZADt03vPJYeHk4QExYNr18iqVVIdnb/WMhw3rvH6ddP0dOOOz1f5aSxZ+yLuCQkJKSkphJCWlhbVHh9A+/TeCuvVV8nBg4o/3t6K64OdTlGtWXP/++8HyOVELidXrvRlryGqsHWhUPj222+zxR27T31eXt5bb71VoTy1BgAP6b2B1ZGOjk5bm/7Fi+YVFXz2K9bWEk/P2vR0kzNnrNeudVm+fHhurmomoufm5r7wwgv+/v45OTkuLi6hoaG///77+vXr+Xx+aGjo0KFDt2zZgouSAI/USwNryBDFnjUsT0
/FSsUikcDFpS0szN7RsdXJqZUQ8sILFYGB5QYGcktLaUaG8YIF/YKDSXl59xttamrauHHj6NGjDx8+bGxsvGHDhps3b/r4+BBCgoOD09PTAwICamtrP/roI/Y5qnqxAFoDm1A8wDBMcnLenj0m77xT3Olbra16UVEuu3YZt7URExPy/vvk449Jnz5PdvCIiIj169eLRCIej/fKK69s3brVzs6OPXX1448/rly5kh1yRkdHr1u3Ljs729JyiLf37a+/1hs2TMWvFIBeCKwHiotJWBjz/PP3xOJaph07VCSEODg4WFtbZ2eTdetIdLTiyUOHkh076mfPfrDJTUtLS0lJSX19vVwu19HRMTExsbe3NzExYb977dq1tWvXXr16lRDi7u6+Y8eOyZMnK9v99ddfCSG+vr42NjbsVyQSya5du86dm33+vCufT1avJp9/TszNFd+Sy+Xl5eUVFRUSiYQQwufzra2tbW1t2X4CaD0E1gMHDihmkFpYkOXLW6qrq1tbW9nosbCw6HhfTkwMee89cusWM2rUdBsbg+3bt9vY2BQXFyszjqWjo8MG0Mcff3zgwAGGYQYMGLBp06agoCD2pp9/VF1NvviC/Oc/ig2hLS0VmRUcLM7LuyORSDre4aijo6Onpzds2DB9/QeTXQG0GALriUkk5Kefsj/8cFJtbS2fzw8MDHzzzTdNTU07Pqetre3QoUP79+9vbGw0NDRcs2bNp59+2uk5XZGZqcjHc+cUj52d29atK5o0qa7Tc3g8Hp/PHzlyJOos0H5saQBPqqqqas2aNWxGmJmZvf/++0lJScntvvvuu/79+7M/3rlz5969e7eHbUVGMgKBlBCGECY4uNjOrk0ovJGcnPzaa6W//HLb379aKLwpEokYhpkxQ0UvD0Aj4TO5mywtLTdt2nTkyJHJkyfX19d/++23S5Ys2b1794oVK9atW1dSUjJs2LCwsLCIiIhBgwb1sK3AQHLq1J333y+ysxO7uTX27y/euVPQXsfpyGQ8sZgnlcorKyvbrw+o6OUBaKTeO9O951pbW52cnHbt2hUbG7tjx478/PyCggK5XG5pablq1aoFCxbo6uq2qihC5PLWpUubFy+uyM83HD68uaVFJzn5rwHmgQO2JiaMnR0CC7QcAksFfHx8pk+fvnfv3tjYWCsrq507d7KXCFV+fpDPf3DAd94p/uCDwUOGPLibx8ur1spKPmKE3enTqm0QQLNgSNh9RkZGysf6+vpr1qyJjIzcv3+/ckKDrq5ux+f0hKGhYcf/NTeXPvtslVBowf6vQNA2fDgZPpywp93l8r8WzwHQJgis7jM0NOw4maCmpmbPnj2dnmNm9mCuVg/Z2NiwM0v5fHnfvop7DxcsqBwxoonPl9vaivX1eewsCvZ02c6dZN8+EhWlkpYBNAimNfRIY2NjTk6OcmLU8ePHFyxYwD7W0dEZNGiQOTvjUxWysrKam5sf/vfi8XiGhobDhw/vOMMrJYVkZJBly1TVOIBGQIXVIyYmJoMGDdLR0eHxeM3NzSUlJYWFhTweT0dHRyAQqDCt2u9/HGJsbNxp0YjCwsILFy5YWlp2TKvCQsUKhUuXqrBxAI2ACksFJBJJeXl5bW2tVCrV1dU1MzOztbXt80R3G3ZZTU1NRUUFu3iWgYGBlZXVoUOHvLy8Ro4cyT5BLievv04mTSLjxin+C6BNEFh0Ky4ubmxsvHXr1qJFi9TdFwDOYUhINwMDg5ycnHnz5qm7IwBPAyosAKAGKiwAoAYCCwB6za05J06cqK2tHT16dHZ29siRI0ePHq2ijgEAqLrCmj9/fk1NTVlZWWpqakNDQw+PBgDAyUl3oVB45syZ/v37Ozg41NbWjhkzJjs7e/ny5d07GgAAh4Hl4eGRlJQUFhbm5eU1aNCg+/fvCwSCLq7/CwDwVIeEs2bNIoTcuHHD2dmZx+M5ODggrQBAowPrHLveOACAJg8JZTKZjY1NdXV1Xl6es7OzqjsGAK
C6CktXV9fb25s9+97tgwAAPKVpDRgVAgA19xLev3/fwcHBzMyssrKSz+ertGMAACqtsAQCwfDhw+vr6//888+eHAcA4GnMdMeoEACeGgQWAPSa9bBaWlqsrKza2tpKS0vZjVsAADS0wjI0NJw2bZpcLo+NjVVRlwAAOFsPC6NCAKBmieSMjIzRo0fb2dmVlJTgdkIA0OgKa9SoUQ4ODiKRKD09XRVdAgDgcolkX19fjAoBgI7AwmksAKBmm6+ampp+/frp6upWVVWZmJioomMAANxUWBYWFk5OTmKx2MfHJykpSSaTqeSwAACcbPPl4+NDCElKSvLw8DA3N/fz89uyZcv169flcrmqmgCAXq5HQ0KGYQ4ePGhqauro6Jienn7t2rXy8vLr16/n5OQon9OvXz9vb++ZM2d6e3sPHTr07w4lEokuXbo0Z86ctLQ0uVw+Y8aMbvcKoKPm5mYjIyN19wI05hzWjh071q5du2/fvjfeeENHR1GyiUSiP/74IyYm5vz58/fu3VM+087Obvr06b6+vv7+/k5OTh0PUlVVJZFILly4cOXKlZUrV2J/Q1CJkJCQ2NjYQ4cOWVpaqrsvoO7Aksvlu3fvHj9+/Lhx486ePRsYGPjwc+7evXv58uUrV65ER0ffv39f+XVnZ+epU6dOmzbt2WefFQgEra2t4eHhCxYsiIqKamlpee+997rdKwDWli1bduzYwefzv/zySzs7OxcXl4yMjAULFqi7X6CmwJLJZDdu3NDX1x84cGCfdo95MsMwGRkZce0uXrxYV1f3oAc8nr29/ahRo5YtW+bu7p6bm2tlZeXh4dHtXgEQQr5tp6ur+5///CcgIOCnn35avnz5Tz/99Prrr6u7a6Cmrep1dXXd3Ny6+GQejze63dq1a9mkYyuvc+fOlbRzdXV9+eWXH3OeC6CL9uzZw6bVrl275s+fz36RPV8BVFPPPyGbdGvXro2MjKyoqPDz8yOEFBYWqqUzoGVCQ0P/93//V1dXd/v27c899xwhJDs729TU9Pbt26amppWVleruIKh74mgPZWZmurq6Wltbl5WV4WNQ6zU0NOzevXv9+vWHDx82NDRUVkAqsW/fvg0bNujo6Pz73/9+5ElVoJpGpMOIESMGDhxYWVl548YNdfcFOGdqampjY6OjozN27NiOM2B67pdfftm4cSOPx9u8eTPSSitpRGARQvz9/XE3Yi8hl8srKysrKirYuSyqOuyePXsiIiLYi4NBQUGqOixoFI0YEhJCjh49unjxYk9Pz/j4eHX3Bbgll8vv3btnbm7e0NBgYWFhZmbW82P+8MMPwcHBPB4vIiLipZdeUkU3QRNpSoXl4+Ojp6eXkJCgnO4A2kpHR8fZ2fnAgQMeHh52dnYrV678+eefi4uLu33A8PDwlStXMgyzefNmpJWWYzTGlClTCCFRUVHq7ghw7o8//jA0NOz4PuTxeGPHjn3vvfdOnjxZV1fX9UMdOnRIV1eXELJ582YuuwwaQYMC64svviCErFq1St0dAW5dvXqVHQbOnz///PnzISEhAQEBpqamnWa9rF+//sSJE48Pr8OHD+vpKeYSfv3110/xFYDaaFBgJSYmEkIGDRqk7o4Ah1JSUiwsLAghL7zwgkQiUX5dIpEkJyeHhIT4+vp2vGVCT0+PDS+hUNja2trxUEePHmXT6ssvv1THSwE10JST7uy5WDs7u4qKipycHBcXF3V3B1Tvxo0bPj4+1dXVixcv/u2339i4eVhzc3NCQkJMTMzly5eTkpKUVxKNjIymTJni6+s7ceLEysrKl19+WSKRbNiwYePGjU/3dYD6MJpkyZIlhJDdu3eruyO91507dxiGEQqFd+/eVe2Rb9y4YWVlRQhZtGhRx9rq8aqrq48dO/buu++OHDmy4/uWnWD8ySefqLaToOE0K7D2799PCJk3b566O9JLFRcXf/3119XV1Tt27Kivr1fhkTMzM21tbQkhc+bM6TSy67qysrLIyM
jg4GArKys9Pb0XX3xRhT0EKmjQkJAQUlpaOmDAACMjo6qqqsev/QAcCQsLCwoKSktLS0hIWLt2rUqOmZ2d7eXlJRKJZs+eHRUVpZJ/WXZZPrFYrK+vr4o+Ah00ZR4Wi11npqmpKSEhQd196Y2q2mVlZWVnZ0+YMEElx8zJyZk5c6ZIJPL39z927JhK0urWrVu7du3Kzs4+duzY2bNnVdFNoINmBRZ2DFMvKyurDz/80MrKqri4OC4urra2tocHzM3N9fb2Likp8fX1jYqKMjAwUEk/R44caW1tPWzYMGtrazs7O5UcE6iAwIL/UldXN27cuI8++ujzzz+3traeMGHC2rVrDx8+3I07EAoKCvz8/IqLi6dNmxYVFdVppmhP5ObmZmdnx8TECIVCnDroVTTrHBYhpK2tzcrKqrm5ubi42N7eXt3d0RSydk1NTX369OFuS4X6+no/P7+kpCQDA4PBgwfn5OSIxWL2W/r6+hMnTvTx8fH29p40adI/xkRhYaGXl1d+fv7UqVPPnj2L3SpBNRjNM2fOHPbsr7o7oikKCgo+++wzkUgUHR29a9cujlppbGxkNysaOHBgfn4+wzDNzc1CoXDDhg2+vr58Pl/5njE0NJw6dSo7mbOtre3hQxUVFTk7OxNCJk+erNqrjdDLaWJgvfvuu4QQPp+/bdu269evy2QydfdI/dj4PnLkSGVlJRfHb2pq8vLyIoQ4Ojo+cgZWQ0ODUChcv369m5tbx0UWjY2NfX19Q0JCkpOT2X+p0tLS4cOHE0ImTZr0RHcFAlAZWCkpKR1HEKampspfCblczvQ+paWlH374YXFx8cGDB7k4fnNz88yZMwkhAoEgLy/vH59fXl5+6NChlStXdlqAv1+/flOmTGH/7dzd3Wtra7noLfRmGnQOKy4uLi0tbf78+cePHx8/fnxSUlJOTk5cXFx+fr7yOXZ2djPbeXt7s4OOh1VVVZ09e3b06NFjxoz59ttv33///af4IujT1ta2aNGi6OhoW1vb+Ph4tjjqukfuQdmvX7+srCxsBQgqp0GBRQjZu3evl5dXUlKSo6MjO0IhhJSUlFy5ciUmJubs2bMdN6qwt7efNm2ar6/v7NmzHR0dlV+Xy+UymSwsLMzW1jYjI+Ozzz5Tx0uhg1gsXrRo0enTp21tbePi4lxdXXtytNzc3C1bttjY2KxYsWLQoEGq6yaA5gXW3r17pVKpn59fWlqaTCZbunTpw8+5e/duTLu4uLiqqirl152dnX3bzZw509DQcO/evS+99NLRo0dv3br16aefDhgwgGiBq1dJSAhhpzJt3UoGDuzh8cRi8eLFi0+ePGljYxMXF9fpZr3uiY6OdnJycnV1jYyM9Pf3Nzc37/kxAf7CaIz6+vrq6mqpVPrIC0+dyGSylJSUb775Zu7cuR2XUuLxeObm5vv27cvIyGBvjmO0Q3MzM2ECU1OjeJyezsya1cPjicVidg/kfv363bx5U4Wn206ePJmWlrZhw4Z79+6p6rAALA2qsLqN3ZaVrbzi4+OlUqmtra1IJCLaJDWV/Oc/5IcfHvyvuztJSCAy2YOC6wnJZLJXXnnl4MGD5ubmMTExXd8N9x+Vl5cnJSWJRKLCwkJ/f/9p06ap6sgAPd35WUOwC1Syy7ylpqa6ubnV1dW1tLSocGq1Ruj00XL7NpkwgYwdS3x9FX9mzCBduw1YJpMtW7bs4MGDffv2FQqFKkwr9j5nHR2dFStWlJWVdax8AVSD0Trjxo0jhAiFQkabNDcz7u4MOwkzM1MxJPz5Z0ZHhyHkwR9TU2buXOabb5iUFObvZ65JpdJXXnmFENK3b9/ExMSn+hIAekwbKqxOZs2adePGjXPnzvn6+hLtUFxMBgwg33xDli4lJiZELCbff0+cnclzz5HERBITo/iTkkJOn1b8IcTLxYXv5MRehXjmmWd4PB57GIZhVq1adeDAAWNj45MnT06cOFHdLwzgCTFaJy4ujhAyevRoRjt89x
1jbMzExv7D04qLmYgI5vXXm/77tJG9vf3LL7/8ww8/5OTksNuLGhkZxcfHP6XOA6iUNpx070QsFltZWTU2NhYWFjo4OHDdXGFhIcMwDg4Of/75p7u7e8d77lRg+3by3ntER4f89BPp8m7Gd+/evXDhQly7ThcfDA0Nz549y94zCEAdjVtepuf09fW9vb0JITExMU+huaqqqiNHjvz000+WlpZhYWGqPPS+fWTdOsLjKa4PPsne687Ozm+88cYvv/xSWlqal5cXGhoaGBho2u6rr75CWgG9tDCwnvKiWg4ODpaWlo2NjcOGDVMuxqICYWHkrbcUD3bvfvCgW5ydnYODgyMjIysrK6urq9etW9fY2KiyTgI8XVo4JGTvERkyZIiFhUVFRQW7LTB3vv7666FDh7q5uV28eHHy5MlPei/eo/34I1mxQjGPYedOsnq1Cg7Yvp+7o6NjVlaWk5PTrFmzOq64AEAL7XzXuri4DB48uKam5vr161y39emnnwYGBlZVVaWmpt64cUO5iV63/RYRcWvzZiKXk+++U1VaEULYkWB+fn56enrHu5oAKKKdgUUI8ff3J4Q8nR0K4uPjJ02atGvXrqVLl5qbm/v5+W3cuDEmJqYb4fXLL78Evfaad3V15c6d5P/9PxV2MjU1NSsra8yYMfr6+qpaWx3gaWO0VFRUFCFkypQpXDd09epVMzMzdt2uESNGKCc9EULMzc0XLFiwY8eOmzdvdmUlryNHjrCbIX/11VdcdxuARtp5DosQ0tDQYGVlJZfLy8vLuVuYKTU11dfXt7q6OjAw8Ndff9XT06uoqIiPj798+fKVK1c6Dkj79evn5eU1derUadOmPfJumN9///3FF1+USCRffPHF559/zlGHAejGaC9PT09CyJEjRzg6fmpqKrv3+vPPP//IvddLSkrYnYoH/vdSMHZ2doGBgaGhocr1DKKjo9ltHT744AOOegugBbQ5sDZt2kQIefPNN7k4eFpaGptWCxcuFIvF//j8rKysPXv2BAYG9uvXr2N4DR061NXVlR0Jzp07l4uuAmgNrR0SEkJSUlLc3NwEAkFRUZFqj5yVleXt7S0SiebMmdON3YyVyxBGR0c3NTURQgwMDFpbWxcvXnz48GFCSE1NjYWFRVlZmbW1NdfTMgBowmgvuVxuY2NDCMnMzFThYbOzs9kNE2fNmtXa2tqTQx05coTdb3nz5s3sSXqJRFJUVPT5558XFxdv3779+++/V13HAaintdMa2NVH/fz8VDu5IScnx9vbu7S01M/PLyoqqofbDvv5+fH5/Lq6ulWrVg0ZMqS2tjYpKUkgEDg6Okokkvr6ekzvBOhIy38fVHuPTm5urre3d0lJia+v7/Hjx3s+m8nMzGzSpElSqTQuLk7Z1YqKitLSUpFIJBAI2AoRAB5gtNqdO3d4PJ6JiYmPj0/HzT67oaCgwMnJiRAybdq0hoYGVfXwq6++IoSsXLny5MmThBAPDw9VHRlA+2jzSff6+vpZs2b9+eefPN5fL9Pa2tq73cyZM4cNG9bFQxUVFXl6eubn50+ZMuXcuXMd93ntoeTkZHd3dycnp4yMDCsrK6lUWlZWxl5/BIDOGC3V1NTEzsNydHRMSkpi50OxJZKSra0tOx/qkZuzKxUVFQ0ePJgQMnny5Hp2kWLVkclk7LgvKyuL3X750KFDqm0CQGtoZ2A1NTWxS2I5ODh0CqO8vLzw8PDg4GCBQNAxvJydnYOCgkJDQ9kF+ZREIhG7AMMzzzxTXV3NRW9feuklQsiOHTu2bt3q7+QU+/HHXLQCoAW0cEjY0tISEBAQFxcnEAji4+PZ4ujx86FiY2Orq6sf3pbVyspq9erVmZmZ48aNi42N5egWn99/+eXOrl3zn3lm+Ftv6Ywdq1i+vahIsW4fAPw3bQsssVi8cOHC6OhoW1vb+Pj4Lq5OJZPJUlNT2TWFL1++zE7mVBo7dmxcXBx3NyQSkYj0708MDUlVFRk8mJSUkJ
s3yahRXDUHQC2tCiyxWPz888+fOnXKxsbmwoULrq6u7F3Qx44d69u3r76+fnZ29tq1azsuqPAwqVSalpYWExPz+++/JycnDxgwICYmZujQodx2fdw4kpZGhEJy4AAJD1dskPP++9y2CEAh7ZmHJZFIAgMDT5061a9fv7i4ODat2FVf5s2bV1NTM2fOnK4cR09Pj92TNTExsaGhobCwcMiQIT1flu8ftE/CIufO/fUAALQ1sGQyWVBQ0IkTJ6ytrWNjY0eOHKn8lkgkWrNmjYuLy5dffimXy7u+7HpjY+OpU6d++OGHEydOnDp1inBKmVN+foo9ci5dIv89LAUALRkSymSyV1555eDBg+bm5jExMSrce/3u3btpaWleXl7x8fELFy4k3BGLiZUVaWxUnG5fuJAkJ5PoaNK1khCg96C+wpLJZK+++urBgwf79u17/vx5FaZVRUVFWFjYoEGDsrKyysvLCaf09YmXl+JBTAyZPVvxAKNCAKorLJFI1NjY6OTkdOPGjbFjx/L5fLlcvnz58oiICDMzM6FQSPfe67t3k3ffJS++SN55h0yfToYPJ5mZ6u4TgGahqcISiUTHjx8vKiqSyWQnT55kGGbVqlURERHGxsYnT56kO60IeVBYXblCPDwUu3u1r0kPALQGlrOzs6mp6aBBg6qqqgwMDN5+++3/+7//MzIyOnXqlDbsZuziQuLjye3bii1Us7PJmTOktVXdfQLQLDQF1r59+8zNzTMyMmprayMiIvbu3WtkZBQdHe3Fnv3RAp6eiq29dHTIhg3E3l4xPAQASs9hdRQZGfnmm2/+/vvvPj4+RGtIJGTSJKLca8fPTzGP1NZWzb0C0BiKvQ9o9MILL8ycOdPa2ppok/p6YmHx1//a2pLKSgQWAJVDwk60La3aV3dXJBQ7q55hyJ075L/XwwHo5WitsLTWunVk6VIybx6JjVU8MDZWd4cANAit57C0WVERycggQ4cqVm4AgA4QWABADQwJNVFKSkpDQ4O9vf0ff/zx/PPPm5ubq7tHABqB4pPuWmzcuHG5ublCoXD8+PFHjx5Vd3cANAUCSxOx+6cuX748Ly8PWxMCKGFIqIkyMjLMzMxEIpGZmZm/v7+6uwOgKXDSHQCogSEhAFADgQUA1EBgAQA1cNJdo8nl8uLiYoFAcPny5ezsbENDw4aGhnHjxk2aNEndXQNQA1RYGq2goMDR0XHo0KHTp0/n8XiVlZUrVqxISUlRd78A1AOBpdFycnIIIY6OjteuXbt165aTk1NYWNj06dPV3S8A9cCQUKPl5ua2L57s4t5O3d0BUDNUWBRUWC4uLuruCIBGQGBRUGENGTJE3R0B0AgILI2GCgugI9yao7lkMpmRkZFEImlqajI0NFR3dwDUDxWW5iosLBSLxQKBAGkFwMJVQs2Vl2c7fnysm1uzujsCoCkQWJrrzh2j1NSZap/MkJKSkp2dvXDhwiNHjowYMcLNzY2jhiQSyfHjx52cnFpbW8vLyxctWsRRQ4SQixcv1tfXz5s3LzU1VSqVcjdlpKGh4cyZM6NGjWpoaCgvL583bx6nL6qurs7Dw+PSpUuTJk1ycHDgrq1Lly7Z2tqWlJRUVVUtXryYPC0YEj6Z0tLSAwcOlJWVPYW22k+4K3awVy8XF5fs7OzTp09PnDjxypUr3DWkp6fn6OiYmpo6YcKE2tralpYW7tpycXG5du1aW1tbRkbGzZs3uWvIwMDAxsYmPT09PDycXZeROy4uLsnJydnZ2fn5+ZmZmdw1VFVVlZ6enpeXl56erqenV1BQQJ4WBNaTCQ8P5/F4crn8KbTVPqWBqH1KQ2tr66hRoyQSSV1dnUwm464hsVjs4uJSW1t7584dGxsbTs/c8fl8MzOzixcvMgxTWFjIXUPNzc3PPPNMaWmpk5NTUVFRW1sb1y8qMzNz6dKl7PVljly4cEFPT6+oqEgqlTY2NhoZGZGnBUPCJ2Nubu7m5hYTExMUFMR1W+xbTu
2BVVNTY2RkNGfOnHPnzi1ZsoS7hvh8fkJCwsKFCwsKClpaWhoaGkxNTTlqq6ioyN3d3dPTkxCSn5/PaYV1+vTpJUuW1NbW1tTU9OnTh7u2itpf1Pjx4+Pi4pYuXcpdQ4sXL25tba2pqamrq6uqqurXrx95WjCt4cmIxeKEhIRp06bp6XGb9TIZMTJSbALd1ETUfpHwxx9/jI6OHjly5MaNGzltKDc399NPP9XR0dmyZYujoyOnbX388ce5ubmenp6rV6/mtKGUlJSQkBA+n79z504rKyvuGpLJZO+++25FRcWiRYs4DSxCSFxc3J49e4yMjL7//nvjp7ndLwNdI5VKjx8/PmPGjPnz55eVlXHdXF4eQwjj4MCoV25ubmBgICFEV1eXEOLj43Pz5k0uGpJIJKGhoZaWluzb0tzcfPv27VKplIu2MjMzZ8+erfwVCAgIuHv3LhcNtbW1bd++XTli6t+/f3h4uFwu56KtpKSkyZMnsw3xeLygoKDS0lIuGmpsbNywYYO+vj7blouLS2RkJPO0ILD+WXFxcUhIyMCBA9m3AiFk8ODBt27d4rTRc+cUgTVzJqMu9fX169evZ4cwpqamY8aMYbdH5PP5a9eura6uVmFb58+fd3V1ZX8BhgwZopzZP3bs2AsXLqiwoYqKirfeeosNX0tLy8GDB7NpYmho+NlnnzU2NqqwrcOHDzs5ObEvZNiwYcrHU6dOTU5OVmFDRUVFL7/8MvvOtLGxGTJkCJsmZmZm27Zta2trU1VDMpksLCzMzs6O3dhp2LBhAoGAfVGzZ8/OzMxkuIfA+ltSqfTkyZMBAQHs+5t92wUHB7PvPFNT0yNHjnDX+rVrzLJlzNatzNMnl8vDw8PZ9yWPxwsMDCwsLGQYpqqqas2aNexY2NLScvv27RKJpIdt5eTksBVcp8/qEydOODs7KyugvLw8lVRw1tbW7OXI4ODgiooK9tMoKCiI/W3v379/aGioTCbrYVu3b9+eNWsW2/kRI0acPXv24Z+qSiqg5ubmkJAQ9kyfvr7+mjVr6uvrH/NTVVUF5+7unpCQ8JifKncQWI9QUlISEhKi/EjU19cPDAwUCoVsMd/Q0MCee+bxeOvXr1f5sKW0VBFVLS2Kx7/9xmRkME/TI9+XfzeeGj58+JkzZ3oysmArOGNj4w0bNrSwr/m/x1MP/zZ2Q0xMzKhRo9g+P3JU2+lVX7lypXsNsZmurOAeHtX+46vuuhMnTgwaNOgxo9pOrzo9Pb17Dd2/f1+Z6QMGDHh4VPvwq+75J9nf6XWB1dra+tVXXz2yfJVKpUKh8NVXX504caJyeLJt27by8vKHnxwaGsrn8wkhXl5eqj2llZPDDBzIfPaZ4vG//sXExTFPR0lJyerVq9n3pUAgOHDgwGPOtvSkAnqiWqO4uDg4OJidwdSNCujOnTvKWmPIkCGPqTX+rq7sIrFYrKw1+Hz+42uNHlZAKSkpM2bMYP/6uHHj4uPj/+6ZD1dAj3wzP76CMzExYUfN69evf8xnRse6siefZI/X6wKrurp6z54958+f7/jF0tLSb7/9dsKECfbt/P39lyxZEhsb+/jzo5cuXWLf3w4ODomJiarqYU4O88YbTEAAk5X1lAKrpaVl9+7dQ4YM8fT0/Mf3ZacKyMzMTFkB1dXV/ePfSkxMVC5I/8gK7pGuXbs2ZcoU9m9NmDChKxVQQ0NDp1qmtbX1H/8WWwEZGBg8UQUkFApHjhzJds/X17eL1yViY2NHjx7N/q2ZM2d2pQKqrKx8fAX3SN0by584cUI5yOj6dQmVj+V7e2BVVVWdOnVqz5497EnES5cuBQcHOzo6slE1efLk3bt3d30cfv/+fXYoYWBgsG/fvh73jQkLexBYWVnMnDnMp5+qJrBkMtmBAwd+/fXXw4cPf/PNNx0/Zo8fP+7u7s6+/ODg4CcqK9i6TFkB2dvbP6YC+seRxePJ5fLIyEh2rgNbARUUFPzdiw0PD7e1tWXPDQcFBY
lEoid6UQUFBcp5dg4ODuHh4T2v4B5TAbHzmB5fAYnF4u3bt/ft25et4NasWVNTU/NEbXUay0dHR3elghs/fjw7vbbruvdJ1kW9LrDYIquysvK7775T/qI6OjquXLnyjz/+6MYlZ4lEsn79evZfNzg4uHsXZZKTmeBgxshIcWXwt98UgcUwirQaO1YRWMXF3TjkI/q5c+fO/fv3b9++XSgUMgxz8+bNhQsXsj8BPz+/q1evdvvgycnJU6dOZX8Ibm5uly9ffvzIoqGhoXsNdayAjIyMHq6A/vzzTw8PD7YnEydO7MmLiouLGzNmDHsob2/vtLS0jt+tqalRXkU1MTHpYgX3SNXV1coKyMLC4uEKqFMFl9GD85qdKqDc3Ny/q+CsrKx6MrOk41j+8Z9kT6Q3BhZ7EqF///7dKKn+TkREBHuB3M3N7d69e138W9XVzPbtjKurIqcIYXR0mNmzmaNHHwRWczMzaBDz889M376KOOvJ5emWlpZ33nlHKBTevn1769at2dnZ//rXvwQCgb29vaur6759+3p+6YCtgJSTPwIDA9mfQ6eRRX5+fg8bYm+mebgCKioqUlZwAoFAJTOe2GKNrYDYYq2srKznFdwjZWZmzpkzR3k9+vTp0wzDZGdnBwQEsF8cOnToyZMne95QxwqILdbq6uoeruBqa2t73ta1a9eUn2QTJkzo9EnWDb00sBiG2bZt26VLl1Q4iy81NZW9amNtbc2WMI9x9Wr2smWMgcGDqLK3Zz75hGHPEojFjHJYUFnJREQwffoonjN9uuICYvfIZLLq6mr2fXn06NFhw4axdeUXX3yhwnKdPXP0ySefsBVQnz59lPfWPPPMM5cuXVJhQ53OAZmYmLCTj4yMjL744ovm5mYVNlRVVbV69Wq2AjIyMlK+qBkzZqSmpqqwIYZhjh07NnjwYOWLYhs1Nzf/97//LRaLVdhQcXHxsmXL2Hw3MTFRvqiAgIDs7GwVNsRezejfvz/7SbZixYqeHK33BhYXKisr2Qslurq6ISEhD6dhbW1taGjo2LFj9fQMbG2lOjqMry8TGakIqce4fp1xclJklo2NCk5pJScn9+/f/4UXXsjKymK4wU5lZH8BjI2NuZuzzhY77GCTEDJ37lyVVHCPlJWV5e/vzzZkZWXF3Zx1tthRTiVnyzouGmLa3wwTJkxgG3J0dDx16hRHDTU1NbFj+Y8++qgnx0FgqZhcLg8JCWGH7gsWLFDW1VeuXHn11VeVKxDY2dlt23bj7y68JCYmxsbGdvxKRYUi2ghh9PSYkJCedpLrafqsH3/8cevWrffv3+e6obt3727duvXnn3/muiGGYfbu3bt169aqqiquG0pNTd22bdvRo0e5bkgmk+3YsWPbtm1NTU1ct3X37t1uT6Zj4eZnTpw6dSooKKi2ttbBwSEgIODy5cvsoks6Ojo+Pj4rV66cP38+O43rkTZt2sTn8z/44IOOX5RKyccfk2+/JQxDXnmFhIZKjYw0d7GN5OTk0tJSDw+Pq1evLliwgLuGysrKLl265OXllZycbGFhoZwzwQWhUGhoaGhhYXH79u1nn32Wu5t+CwoKkpKSpkyZkpCQ0KdPn/nz53PUEMMwJ0+eHDBggJ6eXkZGxtKlSzlatOvYsWMVFRWjRo3Kzs5+7bXXun0crIfFiYCAgMTExGHDhhUVFe3du/fmzZu2trbr16+/c+fO+fPnn3/++cekFXspja2iO35RT49s20aiokjfviQnJ2rqVPe8vDyiqQQCQWpqqpGRUWVlJacNGbVLS0tzdHRMTEzktK2BAwf+2U6/HXcNWVpatrW1lZWVzZ8/v7GxkbuGGIZxcnJKTEyMi4szNze/fv06F62UlJTU1tZKpdIpU6b0sEJCYHFl6NChFy9enDNnjqenZ2RkZGFhYUhIiPJ86uN5eHi4uro+8gN8/nxy9SojFn9948YNd3f36OhoopH69OljbW0tkUi4bohhmLFjxxYWFtrY2HC6PB
6bIzKZ7MUXX3Rycrp06dJTeFFRUVFz587lriHSfnaiubl5yZIlKSkp7FVCLjg4ONTU1KSkpKSnp/fkOBgScqikpOT27dsDBw5U+U6oDQ0Nr7/++pEjR3g83ocffrhp0yaul999UteuXROJRBMnThQKhd7e3gMGDOCoobKysvj4eB8fn6SkJCsrK+UkLC6cP3/eyMjI2tr61q1bCxYs4G5NtMLCwqtXr86dO/f+/fvDhw/nqBVCiFwuP3XqlEAg4PF4NTU1M2fOJJoNgcWhtra2e/fuZWZmPvfccyo/OMMwO3fu/J//+R+pVPrss88eOHDAwsJC5a0AaBTN+ljWMn369Lly5Ypy4p9q8Xi8tWvXxsTE2NraRkdHT5w4sYfFNoDmQ2Bxa+DAgZwupuzp6ZmYmOjm5pabm3vx4kXuGgLQBBgSaoPW1taIiIg333xT3R0B4JbmTuSBLtq9e3djY+NHH32k7o4AcA5DQuq1trYWFBSIxWJ1dwSAc6iwqOfm5iaXyzmdxwigIXAOi3qKG6za77kH0HoYElIPaQW9BwILAKiBwAIAaiCwAIAaCCwAoAYCCwCogcACAGogsACAGggsAKAGAgsAqIHAAgBqILAAgBoILACgBgILAKiBwAIAaiCwAIAaCCwAoAYCCwCogcACAGogsACAGggsAKAGAgsAqIHAAgBqILAAgBoILACgBgILAKiBwAIAaiCwAIAaCCwAoAYCCwCogcACAGogsACAGggsAKAGAgsAqIHAAgBqILAAgBoILACgBgILAKiBwAIAaiCwAIAaCCwAoAYCCwCogcACAGogsACAGggsAKAGAgsAqIHAAgBqILAAgBoILACgBgILAKiBwAIAaiCwAIAaCCwAoAYCCwCogcACAGogsACAGggsAKAGAgsAqIHAAgBqILAAgBoILACgBgILAKiBwAIAaiCwAIAaCCwAoAYCCwCogcACAGogsACAGggsAKAGAgsAqIHAAgBqILAAgBoILACgBgILAKiBwAIAaiCwAIAaCCwAoAYCCwCogcACAGogsACAGggsAKAGAgsAqIHAAgBqILAAgBoILACgBgILAKiBwAIAaiCwAIAaCCwAoAYCCwCogcACAGogsACAGggsAKAGAgsAqIHAAgBqILAAgBoILACgBgILAKiBwAIAaiCwAIAaCCwAoAYCCwCogcACAGogsACAGggsACC0+P8A3nGR/Bj3xfwAAAAASUVORK5CYII=",
            "text/plain": [
              "<PIL.PngImagePlugin.PngImageFile image mode=RGB size=400x400>"
            ]
          },
          "execution_count": 29,
          "metadata": {},
          "output_type": "execute_result"
        }
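        ],
      "source": [
        "test_A_mol = mol_df_A[\"main_compoent\"].iloc[0]\n",
        "test_B_mol = mol_df_B[\"main_component\"].iloc[1]\n",
        "test_C_mol = mol_df_C[\"main_component\"].iloc[0]\n",
        "\n",
        "test_A_smiles = mol_df_A[\"main_compoent_SMILES\"].iloc[0]\n",
        "test_B_smiles = mol_df_B[\"main_component_SMILES\"].iloc[1]\n",
        "test_C_smiles = mol_df_C[\"main_component_SMILES\"].iloc[0]\n",
        "# Each fragment SMILES must begin with the dummy atom (*), so that the\n",
        "# attachment point is atom 0 of its fragment.\n",
        "assert test_A_smiles.startswith(\"*\")\n",
        "assert test_B_smiles.startswith(\"*\")\n",
        "assert test_C_smiles.startswith(\"*\")\n",
        "\n",
        "# Atom indices in the combined molecule run over the core first, then A, B, C.\n",
        "# The dummy atom of each fragment therefore sits at the running atom-count offset.\n",
        "star_of_A_pos = core_num_atoms\n",
        "star_of_B_pos = core_num_atoms + test_A_mol.GetNumAtoms()\n",
        "star_of_C_pos = (\n",
        "    core_num_atoms\n",
        "    + test_A_mol.GetNumAtoms()\n",
        "    + test_B_mol.GetNumAtoms()\n",
        ")\n",
        "\n",
        "# Join the four fragments in one CXSMILES string; the m: block pairs each\n",
        "# dummy-atom index with the core atom index it should attach to.\n",
        "mol_to_combine = Chem.MolFromSmiles(\n",
        "    f\"{core_smiles}.{test_A_smiles}.{test_B_smiles}.{test_C_smiles} |m:{star_of_A_pos}:{core_A_pos},{star_of_B_pos}:{core_B_pos},{star_of_C_pos}:{core_C_pos}|\"\n",
        ")\n",
        "\n",
        "# Draw the combined molecule with atom indices to check the attachment points.\n",
        "Draw.MolToImage(show_atom_number(mol_to_combine), size=(400, 400))"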
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 30,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 417
        },
        "id": "ZoivhSbXzf1C",
        "outputId": "7e72cf60-cd07-4836-8b75-187405525335"
      },
      "outputs": [
        {
          "data": {
            "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAGQAZADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD3+iiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKqw6hbXE4hjdi5V2GY2AIVtrckYyDxigC1RRRQAUUUUAFFFFABRRRQAUUUjMqKWZgqqMkk4AFAC0Vm6X4h0fWrOO707Uba4glOEZHHzHJHQ89QfyrSoAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKwnm/s+8u49OsFmMEfmztJcsD8zM21MhsknJ7Dkc+m7XMeLpYNKtRqci3IhleO2vDBKEHlEkbm4PTcemDz1qKjajc6cJTVSqoNXb29fvX/DnR286XNtFPGcxyoHUn0IyKkrnvCusTX6X9jeJDHeadcGFkhXCmPrGwGTwR/KuhpwlzK6IxFF0ajg+n5br8AoooqjEKKK8xuPGth4e8Y3VlFb63rMlrEy3X9mWXnRW5eTeoY7yQQOMdOvcnAB6dRXBQ/GTwU0vk3eoXGnzf8APO8tJYz+e0j9a0p/FfhTxFp72dp4m0dmmwNrTozYz/dLA59+1AHV15v4t1C78a+IG8CaJO0VpGA+u30Z/wBVGekCn++3f0H4ipfEeuTeHbKHQPDcq33iHWZD9kHBEKbQrTPjPAC5JPVsnHUV0nhDwra+EdBj0+B2mndjLdXT/fuJm+87H
/PGKAJ7fw3YWkNvBa74Le2SNIYU27Y1RgwAyM8lRk5ycCtiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAqpqmnw6tpV1p84/d3ETRk+mR1+o61brO1j+0vIT+zP8AW/OGyFIA2Ng89923+vFJq6syoScJKUd0eWeH/Ew0vxHpE1xnMs7eHtTY8BLiMfuWOeTuGFz7GvZK8B+I+m6hbeJ5tMsLFr2+8T2EMxtoJVWS3vIGGJvQDbuGeM/NzXb2nxJ1WwsoE8QeB/EkVwsaiaa0tRPEWA5OVPAJpQgoKyNcTiJ4ifPPc9HorgYfjL4KaQRXeoXFhMf+Wd5Zyxn89pH61a1T4p+E7HRJ9QtNYs9QmUYhs7aZWmmc8KoTrye+OKowF8d+KrvS0tdB0FVn8SaqSlpH1ECfxTP6Koz16kd8GtXwh4VtPCOgx6fA7TTuxlurp/v3Ezfedj7/AMqyPAnhe8snuvEviErL4k1TDTelrF/DAnoBxn1I74zXa0ARzQQ3EZjniSVD1V1DD8jWDe+AvCOo5N14a0p2PVhaorfmADXRUUAeUJ4e0n4b/FTR7rTLJLXSddgfT3GSwiuMh0IJJI34C46cV6vXLfETw/J4j8FXtra5F/ABdWTL95Zo/mXHucFf+BVe8IeII/FPhLTdZjwDcwgyKP4ZBw6/gwIoA26KKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigApkk0UTIskiIZG2oGYDccZwPU4B/Kn1zXjrW9O8P+GpL/UA8hVgtvbRth7mUghYxjnnPboMntQB0MM8VzCssEqSxt910YEH8RUleZ6XrPxH0rTYVk8B2dzEQX8u31UI6biWIPmZyRn1rat/HN4kCPqfhDXLNzncI4RME5xyVNTKSirs1o0Z1pctNXZ2VZev65beH9KkvbgFznZFEv3pXPRR7mqFp4zsNTIh06KeS7aRYhDPG0OCVZuSRwMIx4z0pIdMi1nxEuqXxfztMJgWzLBoopSA3mKcDOVZcZHH4VDnzL3Dpp4X2U74pNJatdX29E9r7FLwX4VuNOur3xFrkguPEGpnMr4wLeL+GFB2A4z6n1xmuxoorVHE3d3SsMlhinjMc0aSIequoI/I1zOreEdLFxbahpnh/Tf7RgkZ0mW1iBVtjbWJI7PtPHIrqaKBFbT/tX2CH7bg3O394Rjk/hx+VWaKKACiiigAryXTfEel/DDxlr2g63cNZ6RfTDUdNmMTsgL/6yP5QcAMOO3516vMhlgkjG3LKVG5dw5Hcdx7Vm6boqWBl3tHKskccZHlbchAevJz1P0GB0AoAo2Pj/wAIalgWviTS2Y9Fa5VGP4MQa34Z4biMSQSpKh6MjBh+YrGv/BXhfU8m98PaXMx/ja1Td/31jNc/P8GvBDyebb6XLZTf89LS7ljI/Ddj9KAOu1DVotN3ebDM4WJpiYwD8q4zxnPGfp+PFX+ozXlNlpA8FfEyy0u7vrzUtF1y0aK2OpSeeYbiM7tgZhwpHQdye+K9WoAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKAEYhVJOcAZ4Gar2uoW16xWB2YiNJeUZcq4JUjI5zg/lS315bafYT3l7MsFtDGXllY4CqByc15boN18QNcFxrvhmLRrDRbghLG01QSl5YkG1ZCVyVyOnPT8yAetUV58Ne+JticXfgzTNRA6tYakIv0kpT8S76z41bwH4mtsdXt7ZbhB9WU0Adze3ttp1jPe3kyQ20CGSWRzgKoGSTXnnhayuPHniJPG+rwvHplsSuhWUo6L3uGH95u3oMegNUnvpfi/raafDbXlp4R091kvvtEZie9mB+WLH90Yyf6cGvVY
40ijWONFRFAVVUYAA6ACgB1FFFAFa9skvUjDPJHJE4kiljI3I2CMjII6EjkHgms/Sbq2juXtES6M0zSTSTzhf3jK/lnleM/IMAD7uK2aijtbeFy8UESOc5ZUAJycn9aVle5ftJcvJfQlooopkBRRRQAUUUUAFFFFABRRRQAUUUUAcb8TtFn1XwbNc2HGp6VIuo2TAciSL5sD6ruGPXFb/h7WoPEXh3T9Ytv9VdwLKBnO0kcr9Qcj8K0+teXeDdX07wP4j17wZql/bWVtFcfbdLa4lEatDLyY1JIHytnjqcn0oA9RoqC2vLW8Tfa3MM6f3onDD9KnoAKKp3uqWenEC7lMeQCCUYg5IXAIHXJHHWrlABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFAGdJrNtFfC0dJRJ5nl5wMZwnPXpmVB9T6c1o15Ze3Wr+PPF89v4S1GPSdM0lyLnV0tUmNxc4A2Ju6hQME579xitIaT8VLD/AI9/Eug6pj/n/sWgz/36oA9Borz8698TLD/j68G6ZqQHVrDUhF+QkGazNY8S+MvFMcfhuw8LanoE16dl1qdxho7eH+Moy8FiMgcj29QASajI/wAUvFD6Nbu3/CI6TMP7QmQ4F9OvIhU90Xqf/wBRr01ESKNY40VEUBVVRgADoAKoaDodh4b0S10nTYRFa26bVHdj3YnuSeSfetGgAooooAyv7KuP7VW7+2/uluGm8oIckGIJtzu6ZG7pWrRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFAEdwrvbSrHjeUIXLFRnHHI5H1HNc3/wAIZZajcNL4itrHVgIIooRc24kaHaDuwzZPzE5zwT36CuoooA4W5+D3ga4fzY9G+yTDpJaXEkRH4BsfpUH/AAq02ozpXjTxVZY+6hv/ADYx/wABYf1r0GigDy3TItX0b4ix+HPFGsyaxZ6pp7/YLmRPJbcjAyRHYQDleSecgD6V6l0rhvipptxL4Yj1zT1zqWgzrqMGP4lT/WKfYrkkd8Cut0rUrfWNJs9StG3W91Cs0Z9mGR+NAFyiiigAooooAKKKKACiiigAooooAKKKKAIri4htLaW4uJFjhiUu7t0UDqagudWsLO4W3ubuKKVmjVVc4JLlgg/EqwH0qLX7CXVPD2o2EJUS3Nu8SFiQAWUgZIrj7rS9OvfHMuirdPcN9mgmuIZppZWVF89T8xztP75CBkdz9ezDUKdSLcm9LvTsrf5slto7yG4huPM8mQP5bmN8dmHUVLXAw+EvEL+aIPGUtldxyuXEESSq4JyrujYw5GMjp6U99N+Jlkh+yeINC1Fh0+3WbQ5+vl0VcNSim4VU7dPev+VvxBN9ju68/wDHGuX+qapF4G8NTbNTu03392vIsLY9W/32BwB157ZBrN1rxj8RvDmm3c1/4Y0i6MVtJN51leMFRVAy5RhlgMgkDnFdD8N9At9K8NRal9q+36jrCre3l+eszONwA9FAOAPr0zXGUdBoOh2HhvRLXSdNhEVrbptUd2Pdie5J5J960aKKACs7VtIj1aNEkk2BUkTIQMcOpU9frn6gVo0UAIMgAE5PrS0UUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUANkRJY2jkUMjAqykZBB6ivPvhk76Jc654IuGJbR7kyWZY8tay/Mn1wSQfqBXcapZvqGmzWscqwtKNu8oW2jPOACOfQ54ODz0qrpOiRabNPcSGO4u5WINyYVWTy85CFhywByeT3oA1aKKKACiiigAooooAKKKKAGyuY4ndULlVJCjq3tVXS7uS/02G5mt5LeRwd0UikFcEjv9KwvHHi0+GNMiisoPtet37/Z9OsxyZJD3P8Asr1J/lmub0z4OWH2NbrVdV1VteuCZby+tLxoi8jHJwBxgZwOO34UAem0V59/wrrXLP8A5BPxE8QRY6C+KXYH/fQFL
/ZXxSsf+PfxLoOqY6fbrFoc/wDfs0AbXjPxYvhqxt4baP7TrGoSfZ9PtVGS8h/iI/uLnJP+NP8AB3hf/hG9Nka6m+1atev59/dnrLIew/2RnAH+NYfgzw/fS+JtS8ReKri1ufESBbdIbbcYbGIrkKm7u2ck89evJrv62jXnGk6S2e/nbp6fqK2tyna6dDaXl1cxs5e4xuDEYGCx4/FmPOetXKKKxGVrmwtrxw1xHvxG8WCx2lXxuBGcHOB1rh/hfNJpS6z4LunLTaFdEW5Y8vayZeM+/Uj24Feg1514x/4pj4g+HvFqfLaXZ/sjUj22ucxOfo3U+mBQB6LRUMN1FPLNHGW3wttcMhXB69xyPccVNQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFAGTquoz2Gpaaoe2js5mdZ3mkVCDgbQuSM9+np+eqrK6hlYMp6EHINZmu+G9G8S2qW2s6fDewxtuRZAflOMZBHSuWX4P+E7aXzdMj1DTH55s76Veo64YkU4pN6gd6enXFZkt++j+H59Q1LzZPs8bzSBEDPtBJAwowSBgccd+lcyPAWr2n/IM8ea9H6C8ZLoD/AL6AqtbWOq6ZqFj57yzQXWsO0zSDaVkDOFkA7K644HAIXH3jXoQwlFu8aql5WaffqrfmRzPsW/BujX1/qEvjHxBGU1K8XbaWrdLK3/hUf7RByT79skV29FFcuIruvU52rdl0S6JFJWQUUUVgMiS2ijuJbhVPmyhQ7FicgZwOeg5PT1NS0UUAFFFFABXiHxR8SReKYtS0qK7e38OaUdt9dxKGa7vP+WdtEP4iGwT9O2Bn0Lxhrk8bLoWmTCK9uIzJPctwtpbj70hPY9QP/wBVYPhPwrZ61e2Gsy2zJpOmM39lQSf8tnON1zIO7Ejj0/AVl7Vc/IjueAmsN9Yk7eXl/wAHouyb7XpeGNB+KtlocFy2vaWbq4USy2mo2zMUJHAZ05JwBn+tbjaz8RtOiL3vhzRr8IuXazvzAPc/vOgrvKZLH5sLx7iu4EblxkfnkV10asacryipeTv+jTOBq5xCeNdbeS1kufDkunwhmedJJBKZYguWaJl6lfvdORwOTSDxte/2bqtwtt5/2ewa5gkgi3IDvnALHP3cRqePeuxg0+2ghtI1TcbSPy4XbllGAv8AIVx3gEx3NjrWhanbW/26wneyuQkQRZoCWaM7RwFIdsCvQhOhOlKp7JLltdK+zfnfyW/VIh3Ttc7pDuRSepGaWgDAwOlFeSaBRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAVyfxF0ifU/CUtxZcajprrf2jAciSPnA+oyPriusorahWlRqxqx3TuJq6sZ2g6vBr2g2Oq2/+ruoVkAz90kcr9Qcj8K0a888Au+geKfEXgud2McEv9o6duOc20p5Ueytx9Sa9DrObi5NxVl0GgoooqQCiiigArM1PVFh0vUJrSVGmtflc9RE2AST9AcmtOq9tYWto07W8CRm4cyS7R95j1JpNXRUJKMk2rpGDcabDrF1daNf3Ml9amCKZnyqOPnPykoACp25xjsfbHSRxpFGkcaKiIAqqowAB0Aplva29ohS2gihQnJWNAoJ9eKlpRjbXqa1qznaK+FdPPS7ttrYKKKKowCvPvEEp8LfE/RdaAC2Gtj+y71uyzDmFvqfu/QV6DXIfETTotf8MXuilJVuXh+0W0ybfklRl245BByQPfJAyeKuFSULqL3Vn6BY6+ivPI7/AOK2nIi3Wh+HtW2qAWs7t4Wb3PmDGfwxT/8AhYXiCz/5C3w512LHX7CyXf5bSM1AHoFFedz/ABj8NW8Di/h1jSpSpCreWDxndjgZwVB+vHrXWaHJqspmfUGjeBo4mgdCpDHb85+XtnB/HjigD
YooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAoIyMdKKKAMfStAgs3gu7sLdanFG8K3kvzyiNnLbN5GSBn2rYoooAKKKKACiiigAooooAKKKKACiiigAqJra3efz2giM2AvmFBuwDkDPpnmpaKACiiigCjrOlW2uaLe6VeKGt7uFon4zgEYyPcdR7iuW+FmqXF14UOkag3/ABM9DnbTrkHvs4RvoVxz3wa7evOr3/ilPjFaXo+TT/E8H2Wb0W7iH7sn/eX5R75oA9FooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAriPijZxap4TmtI5xBqluUvrCRgwCyxsMENgjOCR7bsnA5rt6qXWl2d67tcxGTfH5bAu2Cuc4xnHX86AOHj8d+LrKKMaz8OdTDhRvfT7iO5BPchVPH0z+NP/4XD4ct/wDkK2Wt6R6/b9NkXH127q9AooA4+D4oeC76FjaeJdPEm07BMxQk9vlbBP0HNbWi3+o3vmfb7P7NiONgNpHzEHcMnrjAPtnB5FVPE3gzRvE2i3tjc6faedPEyx3BhXfE5HDBsZyDg1Q+GWtzax4KtorwbdS01m0+9Q9Vli+Xn6jafxoA7CiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACvOof8AilPjJLD9zTvFNv5qei3cI+Ye25Dn3Jr0WuG+KOnyah4aSbTnT+3NMuIr+wj3AOzqwGACckEEjHc4FAHc0V55H8X9Kt4kbWtD8Q6O5Ub/ALXpzhVPfBGSR74H0rTsvip4G1DHk+JrFM/892MP/oYFAHYUVnQa7pd7bPNYalY3QVCwMVyrKcDuQTge9M0jWF1bftiVNkcTkCTcQXXO08DBH65FAGpRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAVRvdKhvp0mkklVkXauzbx8wY9R/sj2q9RQAVnX2gaNqeft+k2F1nr59sj5/MVo0UAef+JfhF4U1XR7xNP0a1sdSMTG2ntwYvLkAypwDjGcZ46VsfD3xA/iTwTp97PkXsam3u1b7yzR/K2fc4z+NdRXnWjf8AFK/FzVdGPy2HiKL+0rQdhcLxMo9yPmP0FAHotFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFcJ8VLC5/wCEftfEWnRs+paBdJewqgJLpkCROOxXk/7td3WRquhrqk7SGVY827Qf6vJ5YNnOe23j0J79KAOf074veBdRjRk8QW8DsASlyrRFT6EsAPyOK6ex17RtTx9g1axus9PIuEfP5GpL3SdO1FSt9p9rdKeonhV/5iubvvhX4F1DPneGbFM/88FMP/oBFAHXSv5cTvsZ9qk7V6n2GaoaZrdlqzyJaOWMcccjZxwHBIH145rgNb+ENjaaFet4W1HWdPvUhZreGK/cxOwGQjKScg9Oveuu8D65D4k8HaXq0aIsktuqTKqgbXTKsuB0AYNge9AHRUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAV514U/4p
f4k6/4Xb5bPUv8Aib6eOwLHbMg/4EMgegr0WvPvipDJptnpXjK1XNzoF2ssgHV7eQhJU/EEfrQB6DRWRp3irw/q0aPYa3p9wHAIEdwhbn1Gcg+xrXoAKKbI6xxtI2dqgk4BJwPYdarWWpWepKzWc4lVQpJUHA3DI/Q0AW6KKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACsXWtJutTd1jn2QPbmJ42lO1yWU/MuCCMAg+oYjjrW1RQByl58MvBN+CJvDGmrn/njCIf/QMVlf8ACnvDlv8A8gq91zSPT7BqUi4+m7dXoFFAHl2t+C/F+h6HfXmg+O9YuZYIWkW1vlScygDJUORkEjOCO/pXYeC7nTdU8MWGs6bCkSXttGzKpJClRgryT0bcPrmuhrzrwH/xTXjDxF4Lf5bdZP7T00dvIlPzqPZX4/E0Aei0UUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAV558S0bQr7QfHECn/AIlNyIb3aPvWsp2tn12kgj616HWD4h0q41uObTnAfTrm2McyOqFdxdcHkE8LuP4DBBoA24pY54UlidZI3UMjochgehB7in15/wD8Ka8JwEnTBqelMf4rLUJVP/jxNH/Cvdfs/wDkE/EXXosfd+3LHd4/76AzQB6BRXnar4+0DULM6p4p0a+09pMzNPYPFIY1+Z9vl8BtoYjPHFegwzR3EEc0Tbo5FDo3qCMg0APooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigCtd6fbX2z7TGXCZKjcQASMZ4PXBPPbtU0MMdvBHBEoSKNQiKOgAGAKfRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQB/9k=",
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZAAAAGQCAIAAAAP3aGbAAA3EUlEQVR4nO3dB1hTV/8H8BM2socKDUuGVUAFcaEibq12OIqiAnVv29pW+7a11Tpea2vBvmIrW0XFhRZFVEStiAMHhSpokREgKCqRHQIhuf/nEv9IhbYCCclNvp+Hp0+MN+eetPbrueeee34siqIIAAATqMm7AwAArwuBBQCMgcACAMZAYAEAYyCwAIAxEFgAwBgILABgDAQWADAGAgsAGAOBBQCMgcACAMZAYAEAYyCwAIAxEFgAwBgILABgDAQWADAGAgsAGAOBBQCMgcACAMZAYAEAYyCwAIAxEFgAwBgILABgDAQWADAGAgsAGAOBBQCMgcACAMZAYAEAYyCwAIAxEFgAwBgILABgDAQWADAGAgsAGAOBBQCMgcACAMZAYAEAYyCwAIAxEFgAwBgILABgDAQWADAGAgsAGAOBBQCMgcACAMZAYAEAYyCwAIAxEFgAwBgILABgDAQWADAGAgsAGAOBBQCMgcACAMZAYAEAYyCwAIAxEFgAwBgILABgDAQWADAGAgsAGAOBBQCMgcACAMZAYAEAYyCwAIAxEFjASJmZmRERETweLzQ0dNOmTWKxmKikDRs2fPTRRw0NDUQ1aMi7AwDt4ejomJiYaGZmtnjx4sDAQDU1VfyrNyEhYdOmTSwWa+DAgdXV1Z6enlwuV11dfeLEiURJqeJ/ZlAC2trakhe3bt3y9PQkqqegoCAgIEAsFm/cuHHq1KmLFi2Kj4+fMGHC/fv3ifJCYAEjZWVlFRcX379/v6CgQAUDSyAQTJ8+ncfjvfPOO1988UWXLl1CQ0MDAgI0NJT8mknJvx4oK2dn5+3btxNCevfuTVTP8uXL79y54+jouG/fPhaLFRcXJxKJSkpKCgsL+Y26dOlClBGLoih59wGgwxoaiLIPLpr88ssvy5cv19XVvXr1qru7O1EluCQEhktOJkOHkokTycCB5KefiLK7efPm6tWrJbGlammFS0JguPJyMn8+uXSJWFsTgYBMmEBcXcmYMURJ8Xi8mTNn1tXVffjhhx988AFRPRhhAZNdvUqGDaPTihCio0MWLSInTxIlJRaLZ8+ezeFwhgwZ8sMPPxCVhMACJuPxiInJy1+amtLvKKkvv/wyMTGxe/fux44d09LSIioJgQVMZm9PsrNf/vLPP+l35CE3Nzc5OVkkEu3fvz89Pb2tH6+qqgoODg4KCsrNzQ0KCjpw4MArB5w8efL777/X0NA4fPgwm80mqgqBBUw2dCg9pNq3j9TUkBs3SEgIfVUoD2pqardv3z579mzv3r1Ptv2y1MDAYOXKldXV1bGxsUuWLHn48GHz33348GFAQABFUdu2bfP29iYqDIEFTKamRhITyYMHZMYMEhVFTpwg3buT48eJSNTJHenRoweLxRo2bNiNGzcePHjQjhaOHj06fvx4AwODqqqq5ouNampqpk6dWlFRMWXKFMn9QVWGu4TAcEZG5L//pV9UVxN9feLtTS90iIsj777bmb1IS0vjcDhlZWU6OjqTJk1q68eLiopyc3MrKysDAgJiYmImT57c9FvLli3LzMx888039+7dy2KxiGrDwlFgvvJy4utL7t0j+flk507y6adk0iRy+rRc+iIWizMyMlxdXTU1NTveWlBQ0CeffKKvr5+amurs7CyNDjIbAguUQp8+dGAdO0YvwmKz6TVZubnEzq6Te3H//v3hw4c/f/5cQ0PDxsbG/q+cnZ11dXVfv7Xr16+PHDlSKBQePnzYx8dHlh1nDAQWKIWdO8mHH5Jx4+gprYAAEh1NvvqKbN7cmV0QiUReXl7Xr1//uwM0NDRsbW0dHR0dHBwc/5+9vX3TzhPNPXnypH///o8ePVq7du22bdtk3HfGQGCBUqiooAdWfD69suHpUzJ8OLGwIIWFRBrXZa/p888///7777t165aYmNi7d28ul5
vXKDMzMysrKy8vr6CgQNTa3QATExPJ+MvFxaVpODZlypTk5ORRo0YlJiYq/R4Mrw+BBcpiwQISGUnWriXbtpG+fcndu/QV4vTpnXPyuLi4qVOnqqurJyUl/d3Kg7q6ury8vJxmcnNzCwoKWt0vlMViWVhYpKend+vWTfbdZwwEFiiLmzfJ4MHE3JxwuSQsjKxa9eIKUfays7MHDRpUUVERGBjY1pUHQqGwqKgo76/u3r1LCPH39w8PD5dZrxkJgQVKxMODpKWRmBjy1lsvrxCdnGR6zurq6iFDhmRmZs6cOfPQoUNSafPixYtjxoyxsrLKz8/H9WBzWDgKSmTxYvqfISH04ixfX0JRZTExsj7nggULJOukQkNDpdXm6NGjnZ2duVxuQkKCtNpUDggsUCJz5hBDQ/LbbyQrq3zZsqmOjm8GB9fV1cnuhIGBgUeOHNHX1z9+/LihoaEUW16wYEFj9oZIsU0lgMACJaKvX7p8edTw4Rv27DH28Cg0NHz27Nnx48dldLZr16598cUXLBYrKipK6qs658+f36VLl7Nnz3I4HOm2zGgILFAqxb6+81NSfgoL4/P5ixuvEGU0SCkpKfHx8amvr1+7du37778v9faNjY2nT58uFosx7/4XFIByGTRoECFk3759VVVVksu0zMxM6Z6ivr7ey8uLEDJq1CihUEjJRkpKCiHEwsKivr5eRqdgHIywQNksWbJEMrDS19efNWsWIUTqg5TPPvvsypUrlpaWBw4ckN1dvGHDhvXp06ekpKQd+9UoKyxrAGVTW1vLZrPLysp+//13Fovl5uZmbGxcXFwsrcpXhw4dmjVrlqam5qVLl4YNG0ZkKTg4eNWqVePGjUvslAVlig8jLGC869evSyaqwsLCYmNjdXV158yZQwiJiIjo16/foEGDysvLjx07JpVz3bt3b+HChYSQn376SdZpJVk7qqenl5SU9MqWfioLgQWM5+npyefzMzIyMjMzjYyMJHVGCSHR0dE1NTVNV4gdP1FVVdWMGTNqamrmzJmzbNkyIntGRkYzZ86kKApT7xIILFASQqFwxIgR169fF4vFvXv3HjZsWEVFxYwZM1avXs1isa5duzZ37txffvnl/Pnz+fn5rT6E/M8oipo3b979+/f79u0rxTWi/0oSuJGRkTJdUMYUmMMCxsvKyjpz5oyPj09cXJy5ublkoj06OjogIIDFYqmpqbWMJ01NTWtr6zbtV/Xf//73q6++MjExuXXrloODA+lEHh4eaWlpMTExvr6+RLUhsEA5FRQUODk5CYXCDz/8cO7cuVwuV7LHiwSHwxGLxS0/ZWlp2XyPF3t7+169eunp6cXFxUlWRR0/fnzKlCmd/F1CQkKWLl06cuTIS5cuEdWGwAIlJBQKx4wZc+XKlb/bT6q2trb5Hi+SF0VFRS1TTE1NzdjYuKysjKKodevWbdq0iXS66upqNptdWVmZmZmp4hslI7BACX300Uf/+9//rKys7ty58/r7SbW600tmZqZAICCEyHfvhKVLl4aEhKxevTowMJCoMAQWKBvprpMSCoWpqanPnz9/++231dTkdpMqIyND6gvKmAh3CUGpSH2dlKamZq9evVJTUwUCQWxsbEhIiFyeRm5aUBYbG0tUGAILlEhFxSfLltXU1AQEBEhxnZS5ubmDg0NDQ8P9+/dLS0tra2uJPCyR3oIy5kJggbKgKPLBBzFZWR9On757925ZnEEsFo8fP/7OnTtEHmbNmmViYnL16tX09HSiqhBYoCy2biVxcWYU9dO2bW0q//evampqSkpKrly54ubmlp6ePmrUKCIPuv//yNGqVask9wE6IiEhISgo6NmzZ9HR0WFhYYQhMOkOnUEkEqmrq0vmsMVicauV+DrkwgUyYQI9yDp1ii77rKROnjz53nvvEULU1dVtbW1fWfjau3fv15+PFwgEGRkZ5eXlrq6uu3bt2rRpk+Q/kILD/vYgc7W1td98842fn1+/fv0CAwMNDA
wkz/pJTVERmTWLiERk/XolTitCyLvvvvvll19u27ZNLBZLVl00/111dXUbG5vmVVolr3V0dFo2VVdXl5ycvHr16traWi0tLT6fb2BgQBQeAgtkTldX96233iKEJCYmSgrMSLP1ujq6+OCzZ3RRr6+/JsooNjb24cOHfn5+p06dMjAwaGhoqK+vbyrU2rReLDs7O79RUlJSy0KtzWu12trahoaGGhsbZ2dnX7161czMTE9PjzABLgmhM5w8edLc3DwvL6+ysvLBgwf/+9//pNb04sV0FUJbW3L7Nl2UUEldvnyZxWKNGDHiu++++/TTTzVbq2hdX1/P4XCaF2rNycnhcDhCobDlwU5OTmfOnOnkhyI7DoEFneHChQtdunTx9PQUi8WPHj2ysrKSTru//kqmTiW6uiQlhfTvT5RUUVHRr7/+unLlyqtXr/J4PMk01ut79OhR8+coc3Nz09LSCCGTJ0+Oj48njILAgs5TVFS0Zs2a4uJiT0/PpnkWKyur9q8gr68nq1eTAQPIvHlEee3evdvMzMzV1fX06dO2traTJ0/u4GL3sLCwxYsXW1tb5+fnM2KuvQkCCzoJRVEuLi73799/5X1tbW17e3snJydHR0efvn2HWFoSR0diY0NafWpPKCTff0/fEySEjB9PPvus9cPgH1EU1atXr+zs7Pj4+MmTJxPmQGBBJ9myZcu6det0dHTWrl2rpaXVNMlSUlLSdEzyyJFev/1Gv9LUJHZ2dHI1/+nRg55WFwjI9u30MatWka5dycaNcvxSzLV9+/Y1a9a88847zKpwgcCCznDhwoUJEyZQFHXq1KlJf115UF1d3bTNy4yKih7Xr5OcHFJcTC+qesWVK2TGDPp3JRdElZXE1ZUUFnbmF1EaPB7Pysqqvr4+Ly/P1taWMASG0yBzRUVFs2bNEolE69evfyWtGqs167s1+su7AgEdTJKf3NwXL2xtiVj8Iq0IoavS19bS78hvEwXmMjMzmzZt2sGDByMjI7/99lvCEBhhgWzV1dV5eXndunVr3LhxZ86c6egUr6Ulyc8nkpWQNTWkVy961Si0S3Jysre3t6WlZUFBQavrJBQQ/moC2Vq1atWtW7dsbW0PHjwohRtSM2fSzwxKHnXevJnMni2VTqqmESNGuLi4PH78+PTp04QhEFggQ5IHa3V0dGJjY82lsqpz61ZSVUW8vekfkYgw51pGMS1atIhZW9bgkhBkJSMjw9PTs7a2NiIiYv78+fLuDrSivLyczWbX1tZmZ2c7OjoShYcRFshEWVnZtGnTamtrly5dirRSWMbGxj4+PhRFRUZGEiZAYIH0icXiOXPm5OXlDRo0aMeOHbI4BYfDWbp0KSHk22+/DQ0NbbVmF7z+Rqbh4eGMKNSKwFJFt27d2rp1a21t7c6dO7/77jupt79hw4YzZ86YmpoePnxY+ltfNbKzs+vVqxchxMDAoK6urr6+XhZnUQWenp7u7u7Pnj1jxApSBJYqGjhwoI6Ojq6u7qpVq1p9lL8j4uPjt2zZoqamdvDgQTs7OyJjs2bN6tq1a25urqxPpMQWNpbtYMTUOwJLpcXGxo4dO1aKDXI4nLlz54rF4i1btkyYMIHIzJMnT7p06ZKSknLv3j0DAwMXFxfZnUvp+fv7GxgYXLx4MTs7myg2rHRXRdnZ2QYGBmlpaaWlpWKxmKIoFovV8WYFAsH06dN5PN677777+eefE1nq3r374sWLZXoK1WFgYDBz5szw8PCwsLAffviBKDAsa1Bp165d++GHH9zd3b29vR0dHdlsdkdamzdv3p49e5ycnG7dumVkZCS9boLMpaenu7u7m5mZcbncVrdUVhAILNV1+vTpd955p/kfAC0tLSsrq5bb6b7OCvVdu3atXLlST0/vxo0brq6uMu47SN+AAQPu3Lmzf/9+SW0exYTAUlE8Hm/AgAEcDsfExMTb27ukpCQnJ6e0tLTlkZL9qpr225OUNrC1tdVothFVamqqt7d3XV
3d3r17AwICOvergHRIdvXz8vJKTk4migqBpYrEYvHbb7995syZQYMGJScnN608qKioaL4deG5u7sOHD5vvV9VEU1PTzs5OEl6VlZXHjh3j8/kff/xxUFBQp38bkI7q6mo2m11ZWXnv3j2FvYmBwFJFX3/99ebNm83MzG7fvv2vKw/q6uqKi4sldVmatgYvKCgQiUTND+vWrRuXy2XKQ//QquXLl//yyy/Dhg1LTEzs4C7Mp06dys7OnjdvXnR0tLm5ubQuMxFYKic+Pl5SxeDMmTPjx49vXyMCgSA3N1cyENu/f3/v3r2//vrr3r17S7uz0KmOHz8+ffp0yWtLS0vJJGaTXr16vX41sLq6ups3b4pEoqNHj06bNm3MmDFS6SECS7VwOJwBAwbweLzvvvtOWisPRCLRzZs3PT0909PTRSKRh4eHVJoFufjiiy+2b98ubvTKb6mpqVlZWTXVZ2160WqKPX/+PCoqavXq1RRF/fjjj8uWLZNKoVasw1IhtbW1Teuk1q5dK61m8/PzT506NWTIELFYfPz48b59++LCkFnCw8MfPXr0xRdflJWV1dfXCxsVFRU1L9QqmRMobHTx4sVWC7U23V92dHQMDQ3t1q1bdnZ2cnKyiYmJtMpKI7BUyPLly9PS0pycnPbt2yeVlaISjo6OXbt2ZbFYpqamlZWVGLMzzsKFC3ft2lVVVRUREWFpaSm5qSJJn+aHNTQ0cDicpqkAifz8/LKysjuNmh/s6Oh49uxZBwcHySOf0oLAUhXBwcF79uzR09M7ceKEdFd1cjic69evDx8+PCMjQ1tbu76+XktLS4rtg6zdvn27a9euzxoVN2p1CbGGhobkSvCVh67Kysqa35ORFGrNycn59NNPf/31V+l2FXNYKuHGjRve3t719fX79u3z9/eX0Vkkf5akOHaDTtDQ0LBjxw5bW9tx48YZGxunpKQMHz68g22GhIQsXbrUxsYmLy9PyoVaKZC3kpKSzZs3UxR19uzZ4ODgsrIy6bb/5MkTSWl4yQwogKyJxWInJydCSEJCgnRbxm4N8mdmZia5z3L9+vX333//0KFDUmxcJBL5+flxuVxPT09ZbH0F0BKLxVqwYIEstqxBYMlf0zMuLBZLQ0OjoaFBio3/5z//OX/+fPfu3Y8dO4apJeg0CxYs0NbWjo+PL5RqpVsElvxVVFTw+fybN2/269dv//79M2fOlFbLcXFxP/74o4aGxpEjR9544w1pNQvwr8zNzadOnSoSiaKiooj0YNJdgYjF4iNHjtTW1r711lsWFhYdbC07O3vgwIGVlZU7duz46KOPpNRHgNf122+/jRo1ysrKKj8/v/mj8h2BwFIgb775ZtOWj9ra2mw2u/k2L/b29nZ2dmqvV5a9urp68ODBWVlZvr6+MTExMu44QOtcXFyysrLi4uLeffddIg0ILEUh2U+KxWJZWloKBILnz5+3PEZHR6f5Hi+SF9bW1q/cOaYoytfX98iRI7169bp586a0FhkDtFVgYOCnn346adIkaRWXRmAp1jqp8PBwye2V58+fN9/jRfLi6dOnLT+rpaXVo0cPR0dHJycnR0fHbt267d279/Tp0wYGBqmpqXggGeReqFXyqLxUKpIgsOTv6dOnHh4eXC539erVgYGB/3CkQCB49OhR821e8vLyOBxOq1X5FHzrSFARAQEB0dHRX3311ebNmzveGgJLzkQi0VtvvXX+/HlPT8/ffvutHSsP+Hx+8y33Ll++XFpa6uvrGxwcLJsuA7TB1atXhw8fbmFhUVhY2PGn4hFYcrZmzZrt27d37949LS1NWisP6uvrGxoaunTpUl5ebmxsLJU2Adqtb9++d+/ePXbsWNNmW+2GdVjyJKN1UhEREadOnbpz586mTZuk1SZAu0kKskll1TsCS26ys7MDAgIoitq+ffuIESOk2PKUKVMIIR4eHpJHCAHky9/fX09PLykp6eHDhx1sCoElH9XV1VOnTq2srPT19ZX6qs
6HDx/m5+c/ffqUy+W2WggHoDMZGRnNnDmToqjw8PAONoU5LDmQ9TqprKys2tpaa2vroqIiW1tbc3Nz6bYP0FY3b94cPHiwubk5l8ttqtLUDggsOdi+ffuaNWuwTgpUioeHR1paWkxMjK+vb7sbwSVhZ7t27dqXX37JYrGioqKQVqA6Fktj6h2BRSoqSHT0y1+GhxOBgCQkkKSkF+/cuEHS06VzrpKSkvfff18oFP7nP//p+C1eAAaZM2eOoaHhb7/9lpWV1e5GEFikrIz88svLXwYFkdpacvQo8fMjT57Q71y6RGdWxwmFwhkzZjx+/Hj06NFYcACqRl9ff9asWZIiPe1uBIH1t/z8yKefSrPBTz755MqVK9bW1ocOHZLyRtcAiiopKenChQuEkHPnznl4eBgZGb1+NdaWUDWH9vAhmT37xevi4hcv3nuPbNlCGv9V07ZtI1VVxMGBODoSJyfS1u2qYmJigoODNTU1Y2JiunbtKs3eAyiqhoYGGxub/fv3a2pqPnv2bPbs2X5+frq6uu1uEIFFs7Mj27e/eN28utrOnWTGDDJtGv06Kor8+efL39LTo5NL8uPgQHr1utejhzGbzW61Zszdu3cXLVrU2ODOYcOGyfz7ACgGDQ0NCwsLfX19DocjEok2bdrUwcICCCyapiZpejCm+c6ITk7knXfoqFq7lqxfTx48IDk59E9uLuHxSEYG/SPh6Oibk5OppaVlZWXVvASui4uLqanpjBkzampq/Pz8lixZIo/vByAflZWV/v7+EydO9PDwOHHiRMf30UVg/YsvviAHDtAvGqcLXyorexleOTmEy32jqqr0yZMnki1fmh8pqSvRr1+/0NDQTu48gHwZGhrGxcVJXru4uHS8QSwcJUIhPW/VtLnY4cPE0pL07k0MDIiODv0Oj0cPwQwN/72purq64uLi5lVw8/Ly8vPz1dXVg4KCVq5cKeOvAqDkEFh/UVxMPDzI8+f0UoYOzjUJBIIjR47Mnj07ODh49erVY8eOPX/+vNQ6CqCSsKzhJaGQ+PrSa6+8vMiQIS/fl+xcfO7cuTa1Vl9f//Tp07q6unnz5unp6V24cKGpwAQAtA8C66XVq0lKCrG2JocOkebLpM6cOdO/f//r16+3qTXDRpJH1X19faXyqDqAikNgvXDwINm1i2hrk9hY8soyKT6fr6+vr66u3urW6X+nqqoqIyPj0qVLhBDJzcHIyEiBQCD9rgOoDMxh0e7epa8B+XwSEkIan9D8i7KysvDw8K5du86dO7fdpxgwYMCdO3cOHDgwu2mJKgAoa2AVFxfX1NT07NlT6i2Xl5OBA+mlCX5+f3kKWrpCQ0OXLFkyYsSIy5cvy+ocAMqOGZeEeXl5Bw8elMXzdxRFffllXk4O6d+fyHSZ1OzZsw0NDZOTkzMzM2V4GgClxozASktLE4vF+/fvl3rLW7Zs2b3b6a23zsXGkg484fTv9PX1JReDmHoHUPLAcnV1lWzPIt1mL1y4sGHDBhaLrFollkZV2n+xbNkyQsiePXv4fL7MTwagjBgzh8Xj8QwNDTteiLFJYWGhh4dHaWnpt99++80335BOMWTIkNTU1D179nzwwQedc0YAZcKYwCKELFy48PDhw+bm5kOHDnVsph27tQgEAi8vr9u3b0+ePPnkyZNqap000oyKipo/f76np+e1a9c654wAyoQxgRUdHR0QEMBitdJhHR0de3t7FxeXpm0S7O3t7ezs/iGGFi5cGBERYWtre+fOHTMzM9JZamtr2Wx2WVnZ77//7ubm1mnnBVAOzAisP/74w9PTk8/nr1+/fsiQIY8fP87Nzc35fxUVFS0/oqen5+Dg0DQKk7y2srJSU1MLCwtbvHixjo7O1atX+/fv38nf5cMPP9y5c+eKFSuCg4M7+dQATMeAwCovLx8wYEBubq6/v/++fftaHlBaWprTAo/Ha3mktra2vr5+eXm5SCTau3dvQEAA6XT37993cXHR19cvLi6Wek
VCAOWm6IFFUdT06dNPnDjRr1+/69evv/7mquXl5bm5uXl/lZ+fT1EUi8WaPn360aNHiZx4eXmlpKSEhYUtXLhQXn0AYCJFD6yNGzeuX7/exMTk9u3b9vb2HWytqqoqKSlJT09v/PjxRH7279/v7+8/YMCAW7duybEbAIyj0IGVlJQ0ceJEiqLi4+PfeustKba8ceNGZ2fn999/n8hDXV2dtbX1s2fPbt++Ldk69u7du76+vrJ48AhAmSjuwtGCAuH8+atEItH69eulm1aEkEePHunr6xM50dbW9vf3b6qCO3Xq1KFDh5aWlsqrPwBMoaAjLIGADB9OKirqhg7dGhX1jdTXSYlEoi1btqxbt67TVmC1OvWup6dXXFzM4XDS09PlcgcAgFkUdIS1ciVdbquhQTswcIPUM6W2tjYiIqJXr17ySitCSHBwMEVR1dXVn3zySVpamkAguH//vrw6A6ASI6wnjfr27dvWD5aWlh4/fpwQsmjRopaF/EJDyZIldAGIq1fpTRSUz4EDB/z8/NTV1UUiESFEUhxMUhPsNRe+Sjx9+qK2KyGkooL+sbAg2dmk8clL2h9/EGfnvxQuA2A2qgM+++yz3bt3Z2dnt+/jP/74Y8s3f/+d0tWlCKGioiillJGR0aVLF0LIrl273n77bY2/iRNdXV1Pz4lTp1Jr1lC7d1NJSRSHQ4lEf2kqKIgyMKCKiujXCQnUkiVUYSE1aNDLA/r0oZ496/RvCCAzHfrLV0NDIyMj47333mvHZx8+fNjyptjz53SZ5dpasmIF6cDunoqrrKxs2rRpfD7f399/+fLlFRUV8fHxRkZGe/fura+vf2XJWEVFt1f2kdfWJj16vCg33acP/c6YMeSjj+htnQFUQYcCi8/n9+zZk8vltqOga1JS0itlkMViMmcOyc8ngweTH38kyoeiqPnz5+fm5rq5uYWEhFy8ePHrr79msVhRUVEtQ7+ioiIvryI7+0WhVsnP48d09ekHD+gD3N1JQAAZPZquSBYf/7JqxtOn9EbPEs+fd+43BFDkwPrhhx/q6ura93yJZHOo5r77jpw9S7p1I8eO0UMJ5bNx48Zff/3V1NT0+PHjpaWlvr6+IpHom2++mTp1asuDjYyM3N2N3N3/8mZNzcty08bG9Cb0hJCgIDJ5Mtm06cUx6uova77KYItWAMYGVkZGRnBwcPOni01NTdvd2sKF9GBhzRpiZUWUT1Ji4saNG9XU1A4ePGhlZTVq1Khnz56NHTv29bfi4vP5hw8fMjY2nj59muSdHTvof9ra0iPToCB6fp0QYmZGZs168ZGtW2XzZQCYGFhpaWmvPI1samraFF4uLl7W1uMcHelBU6vWrKHHAl9/Tb8ODqbnZZS2NHJBwagVK9a4uelPnTphwoSlS5devXrVxsYmJibm9Teq19LSGjFiRExMzLRpLwKryaefyrB8BoCSBNaYMWPCwsIkuyNI9nt5/vz5zUaEkOHDF6akjGvcr4q88Qb997+LC7G3f/FjZ0df16Snk3ffJf360TXiOzA4U2wCAZk2TT0nZ+u771Jffrl///6QkBAdHZ3Y2Fhzc/PXb0ZDQ8PQ0LCurq7pnTFjiGTlg5YWOXyYPHpETEzI2rUvP7JuHZHfen4ABQssycVg83dKSkqaNngRCAYJBPSES3k5ycujf+LjXx4p2adgwwaydCm93kqZrVhB0tKInR0rMpL1xx8jQ0PZBgbfBgUNGDCgTc08f/48Nja2+RPgkhuFhJDKSjqn0tNJQQGZPv3lR2bMkNZ3AFAIUl5TaNFo+PDhzd8sK3sRWM1/7O0Jj0fc3OiloWFhRGmFhpLISHqQGRtLD4emT7fKy/tz9Wq9BQva2pKpqWnLOxUShobkyRNSUkJOnCAzZ0qj2wCKiZITkYiaOpXKyKCeP6ecnelFjwcOUMqm+SpYkYiaNIl+7eZG8flSP9XPP9Ntjxol9YYBFIjcHqZreuzExIR8/jlpbSdRhmtaBbtyJb
0KduNGkpBAT9QdPy6LCoj+/sTAgL7NmpUl9bYBFIU8H37u3p2eLZb8zzZunHJNDzetgh0yhF4Fe/482byZDukDB+i16jKgr/9iNUNEhCyaB1AICrG9THU1iYkhFEUWLyZK4vJl+h6emRm96YRIRDw86Bm7LVvIl1/K7pwZGfScoJkZ4XLpSTMA5aMQgXX1Kr37lYUFKSwk0quUKm9JSfSXGTyY/m537pB33iFxcaTF1hTSNXAguX2bXpPl5yfT8wCo8H5Yw4bRd+hLSuj/o5VBQwP9z7Fjibc3+eEHOq0cHelZOhmnFSH0tjyNG5nK+jwAKhxYhLy4GGT8/2n79pFBg8iECfTUlWTV2dq19KR7bCz97J/szZpFjIzIw4fi+/efdMLpAFTxklCy/xybTT/N++efxMmJMFJ6OlmwgJ690tcnz57R48aLFzv/wcj1609s3eq3ZMn8nTt3dvKpAVRlhGVkRK94pCgSHk6YKj6evt8pudnZtSv9zJE8no18/31HoZAfHR1dU1PT+WcHUInAapp/iYwkzZ6WYxQej15U1sTMjH6n0/Xp02fo0KEVFRVHjhzp/LMDqEpgDRpEP6ZTWkqvrGSKw4cP8yW7UhFCP22Unf3y9x48oN+RB8nOiJIaYgDKRIECi3FT76mpqUlJSdXV1S9+7edH7z146RI9FXfiBP3A89tvy6VjPj4+pqamqampaWlpcukAgEoE1pw5ZOTI4+Xl3opf86qhoWH37t1du3YtLi5+8ZaJCTl3js4sHx+SkkLPuEsW8nc6XV1dSZXDMGV+rBxUkaLcJWyydOnSkJCQjz/+OCgoiCgwsVjM4XAuXrw4ZsyYHrJ52qYjHjx44OzsLCnUati0ZTIAwylcYGVkZLi5uRkbGxcXF0vKYUH7jBw58vLly+vWrdvUtN97u1AUFR0dXVFRMX/+/AMHDpibm7fc8hRAFS8JCSH9+vUbNGhQeXn5sWPH5N0XZvP399fW1t68ebO2traDg8O4ceOWLFny008/JSUl5eXlSWq4vg4Wi+Xj4yMSibhcbmFhYU5Ojow7DvC3FLEo8JIlS27evBkSEiKZiIH28fX1PXny5OnTp5sqHjb/XW1tbXt7eycnp6Y9+B0dHW1sbFot7Mrlcg0MDCorK/v06XP79u1O/BIAin1JSAipra1ls9llZWW///67m5ubvLvDMFFRUY8fP/bz80tNTf3zzz/XrVsnEAhyc3OzsrJeKdTa8j+9pqamtbW1/V9ZW1sHBgZ6eHj079//5MmTRkZG8+fPl9OXA1WniIFFCFm1alVwcPCKFSuCg4Pl3ReGoSgqPDz87bffNjU1jYiIWL58eauHVVdXN68eIlFcXNzqn4cxY8bEx8frYM8akDcFDaz79++7uLgYGhoWFxfr6enJuztMcvfu3fT0dH9//wMHDowePdrS0vL1P1tfX8/lcptGYZmZmenp6VwulxCyZcuWL2W5mRcAU+ewCCG9e/ceNmxYSkrK8ePH/f395d0dxqAo6tChQ7a2tk+fPu3WrVub0kpS+lByGdj8zWXLlu3evfuqkpc2AmZQrBGWQCDYunUrRVEbNmxITk7m8/kTJ05Ua9r+HeShvLyczWZLJsLs7Ozk3R1QaYqVBXw+v66urqys7NGjRyNHjpw0aRLSSu6MjY2nT58uFovDGbyTBigJxYoDU1PTNWvWGBgYtKkkMnTO09QRERFCoVDefQGVpliBRQg5d+7cnDlzcENKoQwbNqxPnz4lJSVxSrKJNTCVwgXW7NmzXVxc5N0LeNXixp00sGUNyJdiTbqDwqqoqGCz2Xw+/8GDBz179pR3d0BFKdwICxSTkZHRzJkzJatS5d0XUF0YYcHrunnz5uDBg83Nzblcrra2try7A6oIIyx4XYMGDerfv39paelxBm1iDcoFgQVtgKl3kC9cEkIbVFdXs9nsysrKe/fu4WYudD6MsKAN9PX1Z8+eTQjB1DvIBUZY0DZ//PFHv379sIc1yA
VGWNA2ffv2HTx4MPawBrlAYEGboVAryAsuCaHNsIc1yAtGWNBmurq6fn5+mHqHzocRFrR/D2t9ff3i4mIDAwN5dwdUBUZY0B69e/ceOnRoVVXVqlWrOvh3nkgk2rt3b0REBJ/PDw4Ojo2NlV43QdkgsKCdJLNXe/fu1dHRaSrUum3btlOnTrWpUKu6uvro0aNzc3MTEhK8vb0zMzMx6geGFaEAxbdt27aioqK/K9SqpaXVo0cPx7+ytbXV1NRs2ZSRkZGamppQKGQ1oiiKxWJ14lcBxsAcFrQBRVGhoaE8Hm/hwoV79+7t1q3bBx98UFdXV1xcnJmZ2bxWK4fDEYvFr3xcQ0PDxsbmlUKtNjY2J06cEAqF8+bNi4yM7N69u4+Pj5y+Hyg6BBa0WWBgYK9evcRi8ZkzZ3bt2tXqMXw+/5Uqrbm5uUVFRS1TTFKoNSEhQUtLS/Z9B2bDJSG0TWpqqo2NzaRJky5fvvwPNY26dOnSt9E/FGrNy8tLS0vLzc29cOHCjh071q5d2ynfABgMgQVtUFdXd+XKFVtb25KSkuzs7LbWuG21UOvSpUtDQkJSUlIQWPCvcEkIcsbj8aysrCQz97a2tvLuDig0LGsAOTMzM5s2bZpYLI6MjJR3X0DRYYQF8pecnOzt7W1paVlQUNDqugcACYywQP5GjBjh4uLy+PHj+Ph4efcFFBoCCxTCokWLsGUN/CtcEoJCKC8vZ7PZtbW12dnZjo6O8u4OKCiMsEAhGBsbz5gxg6KoiIgIefcFFBdGWKAobty44enp2bVr16KiIhRqhVZhhAWKYsiQIe7u7s+ePYuLi5N3X0BBIbBAgWDqHf4ZLglBgVRVVbHZ7Orq6gcPHvTs2VPe3QGFgxEWKBADAwNfX1/JJjby7gsoIoywQLGkp6e7u7ubmZlxuVwdHR15dwcUC0ZYoFjc3NwGDBjA4/FQqBVaQmCBwkGhVvg7uCQEhcPn89lsdnl5+d27d11dXeXdHVAgGGGBwunSpcvs2bMJIWFhYfLuCygWjLBAEf3xxx/9+vUzMjJ69OhRly5d5N0dUBQYYYEi6tu375AhQyoqKlasWPH6JQ7/zsGDB3/55ZfS0tKQkJCYmBgp9RHkAIEFCqpfv36EkD179ujq6jYv1Hr06NHMzMyGhobXb2rkyJGSRRILFix4+PChLHsNsoUiFKCgtm/f/vjx44SEBKFQ2LJQq6ampp2dnYODg6REq5OTk4ODQ48ePVqtFWZoaGhkZFRdXX316lUvL69O/BIgZQgsUCxHjx4tKCjw8fGJjIwMCQmxsLCQFGrNy8trXqu1oKDgYaNXPm5iYuLs7Ozi4tJUqNXOzu7kyZO6urrl5eUXLlzAbUdGw6Q7KJaysrKtW7euWbPm2rVrffv27dGjR6uHCQSCV6q05uTkFBYWtjrhNWbMmLNnz2po4K9nxkNggWIRCAQJCQnW1taPHj36h8BqlVAoLCoqal6o9d69e/fv3yeEBAUFffzxx7LsOHQG/J0DiuXcuXNPnz4dP358enr6gwcP2hRYmpqaLQu1Ll68OCws7LfffkNgKQGMsEDJlZaWWllZNTQ05OXl2djYyLs70CFY1gBKztzcfMqUKSKRKCoqSt59gY7CCAuU36VLl0aPHm1lZcXhcNTV1eXdHWg/jLBA+Y0aNcrZ2ZnL5Z4+fVrefYEOQWCBSliwYAG2rFECuCQEFSrUKlm9ZWdnJ+/uQDthhAUqwdjYeNq0aWKxODw8XN59gfbDCAtURUpKipeXl4WFRWFhoaampry7A+2BERaoiuHDh7u6upaUlJw8eVLefYF2QmCBClm8eDGm3hkNl4SgQioqKthsNp/P//PPP52cnOTdHWgzjLBAhRgZGc2cOZOiKEy9MxRGWKBabt68OXjwYHNzcy6Xq62tLe/uQNtghAWqZdCgQf379y8tLT1x4oS8+wJthsAClbNo0SJMvTMULglB5VRXV7PZ7MrKyszMTGdnZ3l3B9oAIy
xQOfr6+rNmzSKEYOqdcTDCAlWUkZHh5uZmbGxcXFyMQq0MghEWqKJ+/foNGDCgvLx8xYoVQqGwI00JBIKIiIjo6GhCSGZmJqbGZAp7uoOK6tu37+3bt/fs2XPgwAFra2v7v3J2dtbV1X2ddnR0dIYPHx4TEyMWi8+cOYMNAmUKgQUqaseOHU+fPk1MTKyvr29ZqFVDQ8PGxkZSpbWpXKuDg0OrS7csLS3r6upOnTqlrq6ekZFRX1/faj1X6DjMYYEKoSgqKiqKx+PNmTMnPj5eS0tr7ty59fX1XC63qTKYpFxrQUFBqyUOTUxMJOOvplqtpqamiYmJ2trac+fOJYScPn168uTJ8vhyKgGBBaqlsLAwMDBw+/btN27cuHbt2tq1a1s9rK6uLj8/v6lQq6RWK4fDaWhoaHkwCrV2GvwrBtVibm5uY2Pz9OnTwYMHnzt37u8O09bW7tWo+ZsNDQ2FhYUtC7VeuHBh9+7dK1eu7JRvoNIQWKBCKIr69ddfTU1NtbS09u7d6+Xl1aaPa2ho/F2h1kuXLiGwOgEuCQE65MmTJzY2NmKxmMPhsNlseXdHyWEdFkCHdO/e/b333mtoaIiMjJR3X5QfRlgAHXXhwoWxY8daW1vn5+djHZZMYYQF0FGjR4/u2bNnUVHR2bNn5d0XJYfAAugoFouFLWs6By4JAaSAx+NZWVlJFs3b2trKuztKCyMsACkwMzOTFGrF1LtMYYQFIB3Jycne3t6WlpYFBQUo1CojGGEBSMeIESNcXFweP358+vRpefdFaSGwAKQGU++yhktCAKkpLy9ns9m1tbXZ2dmOjo7y7o4SwggLQGqMjY19fHwoisLUu4xghAUgTdevXx86dGjXrl2LiopQqFXqMMICkCZPT093d/dnz57FxcXJuy9KCIEFIGULFy7E1LuM4JIQQMqqqqrYbHZ1dfWDBw969uwp7+4oFYywAKTMwMBg5syZFEWFhYXJuy/KBiMsAOlLT093d3c3MzPjcrk6Ojry7o7ywAgLQPrc3Nzc3d15PN6yZcvq6uo6Xunn559/FgqFP//8c2JiIlFhGGEByMTcuXP37t3bVOLwlUKtvXr10tPTe82mioqKdu3aNWzYsPT09IkTJw4cOJCoKhShAJCJnTt3Pn/+PDExUSgUtizUqqamZm1t/UqhVkdHx1bLTZubm+vq6lZVVU2cODEhIUGVAwsjLACpkaxx5/F4ixYt+v7778eNGzd69OhXCrVKarVmZ2e3WuJQUqhVwrlR9+7dz507V19fHxAQsGfPHmtr6/fee4+oKgQWgDQ9fPgwODh41apV0dHRo0aNGjlyZKuHCYVCDofTvFBrTk4Oh8Opr69vefDo0aPPnz+vpoYZZwQWgFTV1NT8/PPPc+fONTMz27hx44YNG17/syKRqLCwsKnQdE5Ozt27dyXXkiEhIYsXL5Zlx5kBgQUgNWKx+NChQ2KxeMKECSdOnHB1dR06dGgH21y0aFF4eLiPj8+RI0ek1E0GQ2ABKLSSkhIbGxuKogoKCt544w2i2nBVDKDQLCws3nnnnYaGhj179si7L/KHwAJQdEuWLJFMY4lEIqLaEFgAim7cuHFOTk6FhYUqvswdgQXAACwWa8GCBdiyBpPuAMxQWlpqZWXV0NCQl5dnY2NDVBVGWAAMYG5uPmXKFJFIFBUVRVQYRlgAzHDp0qXRo0dbWVlxOBx1dXWikjDCAmCGUaNGOTs7c7lcVS7UisACYIwFKj/1jktCAIYVahUIBLm5uXZ2dkT1YIQFwBjGxsbTpk0Ti8Xh4eFEJWGEBcAkKSkpXl5eFhYWhYWFmpqaRMVghAXAJMOHD3d1dS0pKTl58iRRPQgsAIZZ3LgxlmpOveOSEIBhKioq2Gw2n8//888/nZyciCrBCAuAYYyMjGbMmEFRlApOvWOEBcA8qampQ4YMMTc353K52traRGVghAXAPIMHD3ZzcystLV2yZEkHC7UyC0ZYAIzk5+d34MAByW
sTExNnZ2cXF5emEmFvvvmmvr4+UToILABGqqqqmjNnTlJSUn19fcudSFkslpWV1StVWh0cHJieYggsAMagKCoqKurp06dTpkxJTk4uKiratGmTUCgsKipqXqg1Ly8vKyurtrb2nwu1Smq1urq6GhsbE4ZAYAEwCYfDCQwM3LFjB4/HO3v2rL+//98d+ejRo6ysrOYp9uDBg5qampZHlpeXGxkZESbQkHcHAKANzMzMbG1tKysrjxw5Mn/+/H848o1Gzd8Ri8VFRUVNVVolKisrmZJWCCwAJhGLxadPn7awsDA2Nu7Tp4+urm6bPq6mpmbbaMyYMYSZcEkIoIooioqMjKyoqPD39z906JCRkVFAQABReFiHBaCKWCzW2LFji4uLBQJBdnZ2SUkJYQIEFoCK6t69u7m5+Z07d2bMmMHj8QgTILAAVJFIJDp8+LCFhYW3t3dmZubQoUMJE2AOCwAYAyMsAGAMBBYAMAYCCwAYA4EFAIyBwAIAxkBgAQBjILAAgDEQWADAGAgsAGAMBBYAMAYCCwAYA4EFAIyBwAIAxkBgAQBjILAAgDEQWADAGAgsAGAMBBYAMAYCCwAYA4EFAIyBwAIAxkBgAQBjILAAgDEQWADAGAgsAGAMBBYAMAYCCwAYA4EFAIyBwAIAxkBgAQBjILAAgDEQWADAGAgsAGAMBBYAMAYCCwAYA4EFAIyBwAIAxkBgAQBjILAAgDEQWADAGAgsAGAMBBYAMAYCCwAYA4EFAIyBwAIAxkBgAQBjILAAgDEQWADAGAgsAGAMBBYAMAYCCwAYA4EFAIyBwAIAxkBgAQBjILAAgDEQWADAGAgsAGAMBBYAMAYCCwAYA4EFAIyBwAIAwhT/B92rjnR5LP/iAAAAAElFTkSuQmCC",
            "text/plain": [
              "<PIL.PngImagePlugin.PngImageFile image mode=RGB size=400x400>"
            ]
          },
          "execution_count": 30,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "Draw.MolToImage(\n",
        "    show_atom_number(rdMolEnumerator.Enumerate(mol_to_combine)[0]),\n",
        "    size=(400, 400),\n",
        ")\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xY7IpHxVzf1C"
      },
      "source": [
        "### 4. Generate all molecules and label them"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 49,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        },
        "id": "w8uJkR8_zf1D",
        "outputId": "81778069-f882-48cf-d897-a215b5de4613"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Generating the No.0 molecule\n",
            "Generating the No.10 molecule\n",
            "Generating the No.20 molecule\n",
            "Generating the No.30 molecule\n",
            "Generating the No.40 molecule\n",
            "Generating the No.50 molecule\n",
            "Generating the No.60 molecule\n",
            "Generating the No.70 molecule\n",
            "Generating the No.80 molecule\n",
            "Generating the No.90 molecule\n",
            "Generating the No.100 molecule\n",
            "Generating the No.110 molecule\n",
            "Generating the No.120 molecule\n",
            "Generating the No.130 molecule\n",
            "Generating the No.140 molecule\n",
            "Generating the No.150 molecule\n",
            "Generating the No.160 molecule\n",
            "Generating the No.170 molecule\n",
            "Generating the No.180 molecule\n",
            "Generating the No.190 molecule\n",
            "Generating the No.200 molecule\n",
            "Generating the No.210 molecule\n",
            "Generating the No.220 molecule\n",
            "Generating the No.230 molecule\n",
            "Generating the No.240 molecule\n",
            "Generating the No.250 molecule\n",
            "Generating the No.260 molecule\n",
            "Generating the No.270 molecule\n",
            "Generating the No.280 molecule\n",
            "Generating the No.290 molecule\n",
            "Generating the No.300 molecule\n",
            "Generating the No.310 molecule\n",
            "Generating the No.320 molecule\n",
            "Generating the No.330 molecule\n",
            "Generating the No.340 molecule\n",
            "Generating the No.350 molecule\n",
            "Generating the No.360 molecule\n",
            "Generating the No.370 molecule\n",
            "Generating the No.380 molecule\n",
            "Generating the No.390 molecule\n",
            "Generating the No.400 molecule\n",
            "Generating the No.410 molecule\n",
            "Generating the No.420 molecule\n",
            "Generating the No.430 molecule\n",
            "Generating the No.440 molecule\n",
            "Generating the No.450 molecule\n",
            "Generating the No.460 molecule\n",
            "Generating the No.470 molecule\n",
            "Generating the No.480 molecule\n",
            "Generating the No.490 molecule\n",
            "Generating the No.500 molecule\n",
            "Generating the No.510 molecule\n",
            "Generating the No.520 molecule\n",
            "Generating the No.530 molecule\n",
            "Generating the No.540 molecule\n",
            "Generating the No.550 molecule\n",
            "Generating the No.560 molecule\n",
            "Generating the No.570 molecule\n",
            "Generating the No.580 molecule\n",
            "Generating the No.590 molecule\n",
            "Generating the No.600 molecule\n",
            "Generating the No.610 molecule\n",
            "Generating the No.620 molecule\n",
            "Generating the No.630 molecule\n",
            "Generating the No.640 molecule\n",
            "Generating the No.650 molecule\n",
            "Generating the No.660 molecule\n",
            "Generating the No.670 molecule\n",
            "Generating the No.680 molecule\n",
            "Generating the No.690 molecule\n",
            "Generating the No.700 molecule\n",
            "Generating the No.710 molecule\n",
            "Generating the No.720 molecule\n",
            "Generating the No.730 molecule\n",
            "Generating the No.740 molecule\n",
            "Generating the No.750 molecule\n",
            "Generating the No.760 molecule\n",
            "Generating the No.770 molecule\n",
            "Generating the No.780 molecule\n",
            "Generating the No.790 molecule\n",
            "Generating the No.800 molecule\n",
            "Generating the No.810 molecule\n",
            "Generating the No.820 molecule\n",
            "Generating the No.830 molecule\n",
            "Generating the No.840 molecule\n",
            "Generating the No.850 molecule\n",
            "Generating the No.860 molecule\n",
            "Generating the No.870 molecule\n",
            "Generating the No.880 molecule\n",
            "Generating the No.890 molecule\n",
            "Generating the No.900 molecule\n",
            "Generating the No.910 molecule\n",
            "Generating the No.920 molecule\n",
            "Generating the No.930 molecule\n",
            "Generating the No.940 molecule\n",
            "Generating the No.950 molecule\n",
            "Generating the No.960 molecule\n",
            "Generating the No.970 molecule\n",
            "Generating the No.980 molecule\n",
            "Generating the No.990 molecule\n",
            "Generating the No.1000 molecule\n",
            "Generating the No.1010 molecule\n",
            "Generating the No.1020 molecule\n",
            "Generating the No.1030 molecule\n",
            "Generating the No.1040 molecule\n",
            "Generating the No.1050 molecule\n",
            "Generating the No.1060 molecule\n",
            "Generating the No.1070 molecule\n",
            "Generating the No.1080 molecule\n",
            "Generating the No.1090 molecule\n",
            "Generating the No.1100 molecule\n",
            "Generating the No.1110 molecule\n",
            "Generating the No.1120 molecule\n",
            "Generating the No.1130 molecule\n",
            "Generating the No.1140 molecule\n",
            "Generating the No.1150 molecule\n",
            "Generating the No.1160 molecule\n",
            "Generating the No.1170 molecule\n",
            "Generating the No.1180 molecule\n",
            "Generating the No.1190 molecule\n"
          ]
        }
      ],
      "source": [
        "combined_mols = []\n",
        "count = 0\n",
        "for i in range(len(mol_df_A)):\n",
        "    test_A_mol, raw_A_mol, test_A_smiles, raw_A_smiles = mol_df_A.iloc[i][\n",
        "        [\"main_compoent\", \"mol\", \"main_compoent_SMILES\", \"SMILES\"]\n",
        "    ]\n",
        "    for j in range(len(mol_df_B)):\n",
        "        test_B_mol, raw_B_mol, test_B_smiles, raw_B_smiles = mol_df_B.iloc[j][\n",
        "            [\"main_component\", \"mol\", \"main_component_SMILES\", \"SMILES\"]\n",
        "        ]\n",
        "        for k in range(len(mol_df_C)):\n",
        "            test_C_mol, raw_C_mol, test_C_smiles, raw_C_smiles = mol_df_C.iloc[k][\n",
        "                [\"main_component\", \"mol\", \"main_component_SMILES\", \"SMILES\"]\n",
        "            ]\n",
        "            if count % 10 == 0:\n",
        "                print(f\"Generating the No.{count} molecule\")\n",
        "\n",
        "            assert test_A_smiles.startswith(\"*\")\n",
        "            assert test_B_smiles.startswith(\"*\")\n",
        "            assert test_C_smiles.startswith(\"*\")\n",
        "\n",
        "            star_of_A_pos = core_num_atoms\n",
        "            star_of_B_pos = core_num_atoms + test_A_mol.GetNumAtoms()\n",
        "            star_of_C_pos = (\n",
        "                core_num_atoms + test_A_mol.GetNumAtoms() + test_B_mol.GetNumAtoms()\n",
        "            )\n",
        "\n",
        "            mol_to_combine = Chem.MolFromSmiles(\n",
        "                f\"{core_smiles}.{test_A_smiles}.{test_B_smiles}.{test_C_smiles} |m:{star_of_A_pos}:{core_A_pos},{star_of_B_pos}:{core_B_pos},{star_of_C_pos}:{core_C_pos}|\"\n",
        "            )\n",
        "\n",
        "            enumerated = rdMolEnumerator.Enumerate(mol_to_combine)\n",
        "            assert len(enumerated) == 1\n",
        "            combined_mol = enumerated[0]\n",
        "\n",
        "            # Generate the label\n",
        "            label = f\"A{i+1}B{j+1}C{k+1}\"\n",
        "\n",
        "            results = {}\n",
        "            results[\"id\"] = count\n",
        "            results[\"label\"] = label  # Add the label\n",
        "            results[\"combined_mol\"] = combined_mol\n",
        "            results[\"combined_mol_SMILES\"] = Chem.MolToSmiles(combined_mol)\n",
        "\n",
        "            results[\"A\"] = raw_A_mol\n",
        "            results[\"A_smiles\"] = raw_A_smiles\n",
        "            results[\"B\"] = raw_B_mol\n",
        "            results[\"B_smiles\"] = raw_B_smiles\n",
        "            results[\"C\"] = raw_C_mol\n",
        "            results[\"C_smiles\"] = raw_C_smiles\n",
        "\n",
        "            combined_mols.append(results)\n",
        "            count = count + 1"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 50,
      "metadata": {
        "id": "x1_0xd3vzf1D"
      },
      "outputs": [],
      "source": [
        "combined_df = pd.DataFrame(combined_mols)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 51,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 712
        },
        "id": "r1fEoZU6zf1D",
        "outputId": "2593d96a-aa29-4373-cb55-7d7a811cbf5f"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>label</th>\n",
              "      <th>combined_mol</th>\n",
              "      <th>combined_mol_SMILES</th>\n",
              "      <th>A</th>\n",
              "      <th>A_smiles</th>\n",
              "      <th>B</th>\n",
              "      <th>B_smiles</th>\n",
              "      <th>C</th>\n",
              "      <th>C_smiles</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>A1B1C1</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCCCCCCCNC(=O)C(CCCCCOC(=O)CCCCCCCC)NCCN(C)C</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CN(C)CCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>O=C(CCCCCCCC)OCCCCCC=O</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCCCCCCC[N+]#[C-]</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>A1B1C2</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCCCCCCCCCNC(=O)C(CCCCCOC(=O)CCCCCCCC)NCCN...</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CN(C)CCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>O=C(CCCCCCCC)OCCCCCC=O</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCCCCCCCCC[N+]#[C-]</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>2</td>\n",
              "      <td>A1B1C3</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCCCCCCCCCCCNC(=O)C(CCCCCOC(=O)CCCCCCCC)NC...</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CN(C)CCN</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>O=C(CCCCCCCC)OCCCCCC=O</td>\n",
              "      <td style=\"text-align: center;\"><div style=\"width: 200px; height: 200px\" data-content=\"rdkit/molecule\"><img src=\"\" alt=\"Mol\"/></div></td>\n",
              "      <td>CCCCCCCCCCCCCCCC[N+]#[C-]</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "   id   label                                   combined_mol  \\\n",
              "0   0  A1B1C1  <rdkit.Chem.rdchem.Mol object at 0x12a1d8a50>   \n",
              "1   1  A1B1C2  <rdkit.Chem.rdchem.Mol object at 0x12a1d8040>   \n",
              "2   2  A1B1C3  <rdkit.Chem.rdchem.Mol object at 0x12a1833e0>   \n",
              "\n",
              "                                 combined_mol_SMILES  \\\n",
              "0   CCCCCCCCCCCCNC(=O)C(CCCCCOC(=O)CCCCCCCC)NCCN(C)C   \n",
              "1  CCCCCCCCCCCCCCNC(=O)C(CCCCCOC(=O)CCCCCCCC)NCCN...   \n",
              "2  CCCCCCCCCCCCCCCCNC(=O)C(CCCCCOC(=O)CCCCCCCC)NC...   \n",
              "\n",
              "                                               A  A_smiles  \\\n",
              "0  <rdkit.Chem.rdchem.Mol object at 0x11fc818c0>  CN(C)CCN   \n",
              "1  <rdkit.Chem.rdchem.Mol object at 0x11fc818c0>  CN(C)CCN   \n",
              "2  <rdkit.Chem.rdchem.Mol object at 0x11fc818c0>  CN(C)CCN   \n",
              "\n",
              "                                               B                B_smiles  \\\n",
              "0  <rdkit.Chem.rdchem.Mol object at 0x11fc821f0>  O=C(CCCCCCCC)OCCCCCC=O   \n",
              "1  <rdkit.Chem.rdchem.Mol object at 0x11fc821f0>  O=C(CCCCCCCC)OCCCCCC=O   \n",
              "2  <rdkit.Chem.rdchem.Mol object at 0x11fc821f0>  O=C(CCCCCCCC)OCCCCCC=O   \n",
              "\n",
              "                                               C                   C_smiles  \n",
              "0  <rdkit.Chem.rdchem.Mol object at 0x11fc82730>      CCCCCCCCCCCC[N+]#[C-]  \n",
              "1  <rdkit.Chem.rdchem.Mol object at 0x11fc827a0>    CCCCCCCCCCCCCC[N+]#[C-]  \n",
              "2  <rdkit.Chem.rdchem.Mol object at 0x11fc82810>  CCCCCCCCCCCCCCCC[N+]#[C-]  "
            ]
          },
          "execution_count": 51,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "combined_df.head(3)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BxE5vZch8C6p"
      },
      "source": [
        "- **The path in the bracket need to be changed as you need to store the output of the generation**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 52,
      "metadata": {
        "id": "bGXwGbCvzf1E"
      },
      "outputs": [],
      "source": [
        "# only include the columns that can be saved in csv before saving, including the label\n",
        "combined_df[\n",
        "    [\"id\", \"label\", \"combined_mol_SMILES\", \"A_smiles\", \"B_smiles\", \"C_smiles\"]\n",
        "].to_csv(\"/Users/yuexu/Documents/GitHub/Agiledataset/AGILE_1200_SMILES_RDkit.csv\", index=False)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "mh1I6IZbzf1E"
      },
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "agile",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.16"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
